Bitjuice Consulting is all about squeezing the useful information out of raw data. Whether it’s cleaning a messy data source, creating visualisations to highlight a trend, or training a predictive model, Bitjuice is here to help. With experience in both the corporate and academic worlds, we can deliver cutting-edge solutions on time and on budget.
“Data Science” is a broad term, encompassing a huge range of activities. Here are a few of our specialities:
Product Analytics is a simple concept – take field data and logs from products, and analyse them to gain a better understanding of how those products are used. The real challenge is the diversity of tasks required to move through the workflow:
Data Preparation: Preparing and “cleaning” the data usually requires intense, detailed technical work. Extracting data from files, databases and hardware devices is messy and challenging. Writing scripts, understanding the technical aspects of the product, and translating this into cleaned data for analysis often take up the majority of the time on a project.
Data Analysis: Once the data has been cleaned, analysis can range from plotting a few graphs in Excel, to more in-depth statistical analysis, to applying machine learning techniques such as clustering. More than just writing code, this requires an intimate understanding of how data works. Understanding concepts such as overfitting and uncertainty can be the difference between gaining insight and developing a dangerously inaccurate view of what the data means.
Telling the story is really the most important part of the process. It’s all well and good to have an analysis of the data, but what does it mean for the business? Does the product design need to change? Have gaps been identified in marketing or training on the product? At Bitjuice Consulting, we understand that it’s not enough to be clever at manipulating data; when we do a project, we ensure we communicate the results effectively, so you are equipped to make the right decisions.
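To make the overfitting point from the Data Analysis step concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the data is synthetic (a made-up process where y is roughly 2x plus noise), and the model is a simple nearest-neighbour average rather than anything used on a real project. With k=1 the model memorises the training data, so its training error is zero, yet it still makes errors on data it has not seen.

```python
import random


def make_data(n, rng, noise=1.0):
    """Synthetic, hypothetical process: y = 2x plus Gaussian noise."""
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    return [(x, 2.0 * x + rng.gauss(0.0, noise)) for x in xs]


def knn_predict(train, x, k):
    """Average the y values of the k training points nearest to x."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    return sum(y for _, y in nearest) / k


def mse(train, data, k):
    """Mean squared error of the k-nearest-neighbour model on `data`."""
    errors = [(knn_predict(train, x, k) - y) ** 2 for x, y in data]
    return sum(errors) / len(errors)


rng = random.Random(0)          # fixed seed so the run is repeatable
train = make_data(200, rng)
held_out = make_data(200, rng)  # fresh data from the same process

results = {}
for k in (1, 15):
    results[k] = (mse(train, train, k), mse(train, held_out, k))
    print(f"k={k:2d}  train MSE={results[k][0]:.3f}  held-out MSE={results[k][1]:.3f}")
```

Judging the k=1 model by its training error alone would suggest a perfect fit. Only the held-out figure shows how it behaves on new data, which is why we evaluate on data the model has not seen.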
Machine Learning and Statistics
Machine Learning is the practice of taking a body of data and using an algorithm to learn a predictive model from it. When exposed to new data (generated by the same process), the learned model should be able to make useful and accurate predictions about it. Machine Learning goes hand in hand with Statistics. At Bitjuice, we work with whatever is best suited to the problem at hand – whether it’s a simple linear regression model, a well-established machine learning technique (such as support vector machines or k-means clustering), or the latest developments in research (such as dictionary learning and sparse coding).
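As a small illustration of "learning a predictive model", the sketch below fits the simplest option mentioned above, a linear regression with one input, using the standard closed-form least-squares formulas. The numbers and variable names (`hours`, `drain`) are invented for the example and do not come from any real product.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one input: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) model to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical training data: hours of use vs. battery drain (%).
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
drain = [3.1, 4.9, 7.2, 8.8, 11.0]

model = fit_line(hours, drain)
prediction = predict(model, 6.0)  # a new input the model has not seen
print(f"slope={model[0]:.2f}  intercept={model[1]:.2f}  prediction={prediction:.2f}")
```

The same pattern of fitting on existing data and then predicting on new data applies to the more sophisticated techniques, although the fitting step becomes considerably more involved.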
Example applications include:
Computer Vision: Although images and video are “just” a 2D data input, computer vision is a significant field of research in its own right. There are a large number of algorithms that are particularly suited to interpreting visual data, for tasks such as segmentation, feature extraction, object recognition and scene analysis. For some background on our experience in this area, see the research page.
Audio Processing: As with computer vision, audio data comes with a slew of tricks and special algorithms that can be used to clean up a noisy signal before more standard machine learning is applied. With experience in the implantable hearing industry (where getting a clean signal and making sense of what’s going on in the sound environment are important!), we are well equipped to tackle problems involving sound.
Humans are visual creatures. Our brains are fantastically adapted to recognise patterns, movement, colours and objects in the visual world. When the goal is to translate a huge, multi-dimensional data set into something a person can understand, the answer usually involves some kind of visual representation. Whether it’s part of a larger project, or a standalone piece, we have experience creating:
Using Python’s matplotlib package, we can create anything from a clean, simple, publication-quality graph to illustrate a point, to almost any kind of 2D visualisation you can dream up. Take a look at the example gallery on the matplotlib webpage for some inspiration.
When you want something really special, and start throwing in words like “interactive” and “animation”, we turn to the HTML5-compliant D3.js library. Although it takes longer to do simple things (a bar chart is not trivial to create!), the result can be truly beautiful to behold. When browsing the D3.js gallery, it’s important to remember that all the visualisations are generated from data. This means that when the data is updated, the visualisation is too (unlike many of the ones you’ll see on the web that have been hand-crafted using graphic design tools).
As well as working with data, performing analysis, and writing one-off reports, we can also develop software applications for ongoing use. We have experience in the development of both web-based and standalone applications, primarily working with tools such as Twitter Bootstrap, Django, PostgreSQL, Qt and the scientific Python suite of tools.
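For the matplotlib route described above, a clean static chart takes only a few lines. The sketch below is illustrative: the feature names and session counts are made up, and `usage_chart` is a hypothetical helper rather than part of matplotlib. It assumes matplotlib is installed, and it renders to an in-memory PNG so that it runs without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")  # render off-screen; no display required
import matplotlib.pyplot as plt


def usage_chart(labels, counts, title):
    """Draw a simple bar chart and return the figure."""
    fig, ax = plt.subplots(figsize=(6, 4))
    ax.bar(labels, counts, color="tab:blue")
    ax.set_title(title)
    ax.set_ylabel("Sessions per week")
    # Remove the top and right borders for a cleaner look.
    for side in ("top", "right"):
        ax.spines[side].set_visible(False)
    fig.tight_layout()
    return fig


# Made-up data for illustration only.
features = ["Feature A", "Feature B", "Feature C"]
sessions = [120, 45, 80]

fig = usage_chart(features, sessions, "Hypothetical feature usage")
buffer = io.BytesIO()
fig.savefig(buffer, format="png", dpi=300)  # high resolution for print
png_bytes = buffer.getvalue()
plt.close(fig)
```

Because the chart is built from the `features` and `sessions` lists, changing the data and re-running the script regenerates the figure.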