Tag Archives: Data-Science

So, I’m just heading back home after an awesome few days in Budapest, where I was attending the second edition of Craft Conference. I didn’t realise the conference was happening last year, but when my twitter stream was flooded was interesting looking #CraftConf tweets I had a look at the UStream live stream, and then I realised I was missing out! I managed to watch many of last year’s talks via UStream after the conference, and there was some great speakers and great content. As soon as this year’s Craft Conf was announced I bought a ticket, flights and hotel! Having completed the roller-coaster ride that was CraftConf 2015, I’m happy to say that I wasn’t disappointed…

The good…

A quick look down the list of speakers will give you an insight to the amount of tech rockstars that were in the building: Mary Poppendieck, Dan North, Jessica Kerr, Randy Shoup, Trisha Gee, Michael Nygard, Amber Case, Michael Feathers, Sandro Mancuso….need I go on? If you look back at the list, something may strike you as different compared to the majority of other tech conferences – many of the rockstars present are female.

Having first-hand experience of sitting on tech conference program committees myself, I know that it can be very difficult to organise and encourage talk submissions from a diverse group of speakers, of which gender diversity is just one part. I have to tip my hat to the CraftConf organisers for running a tech conference where 25% of the speakers were female, and the fact that they stated in the keynote that this isn’t enough should be applauded (kudos also to the organisers for offering ‘diversity tickets’ to attendees)

The vast majority of talks lived up the billing, and all four keynote sessions were phenomenal (and I don’t use this word lightly!). I left every keynote buzzing, and judging by the amount of chatter and questions from my fellow attendees, so did most other people. Don’t get me wrong, the keynotes weren’t all ‘ra ra – new tech’ or ‘developers, developers, developers’, they were genuinely thought-provoking and some of them boarder-line offensive (in a good ‘call to action’ kinda way). If you go to as many conferences as I do (including some of the bigger names), you’ll again realise how much effort must have gone into organising and planning this content – a big thanks once again to the organisers.

The ‘standard’ sessions running throughout the day were superb (with only a couple of misses detailed below), and the fact that talks were running from 9am to 7pm allowed me to definitely think I was getting value for my money. My general rule for tech conferences these days, is that if I enjoy over 50% of the talks I attend then I’m happy – CraftConf was in the 80-90% range, which is very rare (only the QCon series manages this usually).

During breaks, or when I did skip a talk session, there was always someone to talk to out in the sponsors area or one of the lounges. A tip of the hat again to many of the speakers, who constantly made themselves available and approachable when milling around the conference (and thanks in particular to Adrian Trenaman, Randy Shoup, Sandro Mancuso, Werner Schuster, Trisha Gee, Isra Kaos, Sven Peters, Simon Brown and James Lewis for some great chats).

I’ve listed some of my favourite talks below (in no particular order), and have included links to InfoQ summaries I have written where available (I will try and write more over the coming weeks).

There were also several talks I missed, due to InfoQ commitments or otherwise chatting to interesting people, and I’ll try to watch these back when they are released on UStream:

  • The Ethical Developer – Grady Booch (heard great things about this talk throughout the conference!)
  • DevOps for Everyone – Katherine Daniels (I follow Katherine on twitter, and she talks about some cool stuff)
  • Automating the Modern Datacenter – Mitchel Hashimoto (who doesn’t know what Hashicorp do?)
  • Consensus Systems for the Skeptical Architect – Camille Fournier (I again heard great things about this talk via twitter)

The logistics of the conference was also superb. I found the venue easily enough, registration was a breeze, and moving around the venue was straight-forward (after some initial hunting for the ‘tent’ room, which was actually a tent outside! 🙂 ) Lunches and breaks were well spaced apart and nicely timed, and the food and drink was also excellent (including getting fed for the entire day on the Thursday). The queues for the lunch sessions did get a bit disorganised, but after completing the Paris Marathon and nearly getting shoved to the ground several times when stopping for food and water breaks, I’m starting the think that queueing may be a distinctly-British thing… 😉

The (not so) bad…

I had a few minor grumbles, and to be honest these really were minor. A couple of the talks were too high-level or a bit disorganised, but I only left one early. I’m not going to name and shame people here, but if the conference organisers do ask for feedback then I’m happy be honest. What could help for next year is to indicate the intended audience ‘level’ for a talk, but even then I would expect quite deep content for each session at the conference that I believe CraftConf is trying to be.

I did have a couple of interesting chats with people who mentioned that CraftConf isn’t really a conference about software craftsmanship, and I do agree here – at least in terms of how we define mainstream ‘software craftsmanship’ in London and the USA i.e. focusing on clean code, TDD and outside-in development.

However, I strongly believe that CraftConf is a conference about the software development and delivery craftsmanship, and this is exactly what a generalising specialist like me wants. I care deeply about software development from idea inception through to development, delivery and ultimately validation (thinking about project management, leadership and stakeholder management along the way), and CraftConf hit all of these points nicely for me. The only reason I mention this fact, is that if you are a hardcore London or US-based software craftsman, then don’t come to this conference expecting the type of content you may be used to in typical craftsman events.

I’ve seen a couple of tweets and another blog (including the ever-informative Burning Monk) saying that there was a lack of deep-dive language and functional talks, but I think the conference was all the better for it. This is obviously a subjective opinion, and I enjoy learning about languages and functional as much as the next developer, but I tend to target specialist conferences around these subjects when I want to know more (for example, Scala eXchange or SpringOne 2GX).

For me, the balance of talks at CraftConf was superb, and there could be a danger of diluting the content (and the audience) if the organisers try to make the conference ‘all things to all people’. Just my two pence, but the organisers have run two amazing conferences already, and so I say listen to your heart, not necessarily the advice… (and I appreciate the irony of my advice in this context 🙂 )

And yeah, there was no ugly…

If you’ve read the above content, then you’ve probably figured that I think quite highly of CraftConf, and you would be right. I thoroughly enjoyed my time at the conference, and the combination of speakers, topics and attendees was fantastic. I’ll definitely attempt to head back next year, and this time I’ll keep an eye out for the CFP, as I would love to speak here as well.

The primary meme I took away from this year’s CraftConf is that software development is on the verge of breaking out of the traditional IT shackles, in much the same way computing infrastructure has broken-free and is rapidly evolving with the mainstream adoption of cloud. For anyone familiar with the diffusion of innovation graph, lean principled and business-focused software development is getting dangerously close to the late majority, and this really will shake up the industry (especially the traditional software development shops and developers). Alf Rehn’s and Marty Cagan’s keynotes were great example of this

I’m starting to think that the ‘full-stack’ developer of 2018 will not only have the skills that we expect now, but they will also be business savvy, and know enough statistics and data science to appropriately challenge slow moving executives as to what the expected return on investment will be from the latest (HiPPO) instructions from above… People may say this kind of thing happens now, and it does (and it famously has been in companies like Amazon for quite some time), but I believe this will soon happen in the majority of companies. Don’t forget, if you attended CraftConf or are reading this blog you are already part of a self-selecting audience who aspires to learn more and become better – and you’re in the minority within our industry…

The second meme that I picked up on was is that we are working in a golden age of tooling and process. With the emergence of technologies like cloud and containers, and practices like lean, BDD and TDD, there really is no excuse to be building bad software applications. It may take a bit more thought and design, as mentioned by Mary Poppendieck, Dan North and Jessica Kerr, but it is most definitely possible to build large complex systems that can be validated, tested and deployed with relative ease.

In summary, if you are looking for premium-level conference content at a frankly bargain price (with a fantastic location thrown into the mix), you can’t go wrong with CraftConf. I guarantee you’ll leave with more knowledge than when you arrived, and a whole-lot more questions…

Once again, many thanks to all of the organisers, volunteers and speakers – a great job by all concerned!

Data Science for Business – by Foster Provost and Tom Fawcett


In a nutshell: If you are looking for a simple (but not simplistic) introduction to nearly all of the underlying data science fundamentals then look no further, because this is the book for you! You’ll learn about data-analytic thinking, correlation, segmentation, model fitting/overfitting, similarity (k-NN etc.), clustering (k-means etc.), probability (Bayes et al), mining text, result evaluation etc., and more!

I often work as a software developer, and so like to think that I have a good working knowledge of data science (‘data analytics’, ‘big data’ etc.), which has been achieved through College/Uni education and also through the modern media and blog posts etc. However, I often struggled at times to fully understand, and perhaps more importantly knit together and apply, the core fundamentals of the topic. This book has provided exactly the explanations and ‘glue’ that I required, in that it delivers a very well structured (and paced) introduction and overview of data science, and also how to think in a ‘data-analytics’ manner.

If you preview the book on Amazon with the ‘look inside’ feature then what you see in the table of contents is exactly what you get. Every chapter delivers upon its title (and promised ‘fundamental concepts’), and frequently builds superbly upon topics introduced in early chapters. You’ll move seamlessly from understanding how to frame data science questions, to learning about correlation and segmentation, to model fitting and overfitting, and on to similarity and clustering. With a brief pause to discuss exactly ‘what is a good model’ you’ll then be thrust back into learning about visualising model performance, evidence and probabilities and then how to explore mining text.

The concluding chapters draw upon and summarise how to practically choose and apply the techniques you’ve learnt, and provide great discussion on how to solve business problems through ‘analytical engineering’. There is also some bonus discussion on other tools and techniques that build upon earlier concepts which you might find useful, data science and business strategy, and some general thinking points around topics such as the need to human intervention in data analysis and privacy and ethics.

The book is superbly written and reads very easily, which for the potentially dry topic of data science is worthy of praise alone. The majority of chapters took me each approximately an hour to read, and then another couple of hours to re-read and ponder upon (and sometimes looking at other provided references) to fully understand some of the more complex topics and how everything related together. Each chapter also provided plenty of pointers and experimentation ideas if I wanted to go away and practically explore the topic further (say, with the Mahout framework, or R, or scikit-learn/Pandas etc.). The book could probably be read by dipping in and out of chapters, but I think you’ll get a whole lot more from a cover-to-cover reading.

In summary, this is a superb book for those looking for a solid and comprehensive introduction to data science and data analytics for business, and I’m sure will that even the more experienced practitioners of the art will find something useful here. The book introduces topics in a perfect order, superbly builds your knowledge chapter after chapter, and constantly relates and reinforces the various techniques and tools your learning as it progresses. I wish more text/learning books were written this well!

Click here to buy  Data Science for Business: What you need to know about data mining and data-analytic thinking on Amazon UK (This is a sponsored link. Please click through and help a fellow developer buy some more books!)