Did you know that open source data analytics software was used to reveal the Panama Papers?
The Deep Learning Analytics track at GOTO Copenhagen 2016 provided an introduction to deep learning, its capabilities and its many practical applications, along with discussions of how machine learning can be used to discover similar research ideas across domains and disciplines. The track also covered how to analyze huge amounts of data and apply machine learning algorithms to it to gain new insights.
Watch the videos from the Deep Learning Analytics track at GOTO Copenhagen 2016 below.
Deep Learning: What It Is and What It Can Do For You
with Diogo Moitinho de Almeida, Programmer, Mathlete and Senior Data Scientist at Enlitic
We start from the basics of deep learning: what it is, how it works, and how to get started, and then move to the most commonly used architectures including convolutional and recurrent networks. We then talk about its capabilities in the context of extremely challenging tasks that the research community has been trying to solve, as well as practical applications.
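As a taste of the basics the talk starts from, here is a minimal sketch in plain Python (my illustration, not material from the talk) of the core operation behind convolutional networks: sliding a small learned filter over the input, then applying a non-linearity.

```python
# Minimal sketch of the building blocks of a convolutional network:
# a 1-D convolution (cross-correlation, as deep learning frameworks
# define it) followed by the ReLU non-linearity.

def conv1d(signal, kernel):
    """Valid 1-D convolution: slide the kernel over the signal."""
    n, k = len(signal), len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(n - k + 1)]

def relu(xs):
    """The most common non-linearity between layers."""
    return [max(0.0, x) for x in xs]

# A hand-written "edge detector" filter applied to a step signal.
# In a real network, kernel values are learned from data.
signal = [0, 0, 0, 1, 1, 1]
kernel = [-1.0, 1.0]  # responds to upward jumps
print(relu(conv1d(signal, kernel)))  # -> [0.0, 0.0, 1.0, 0.0, 0.0]
```

Stacking many such filtered-and-rectified layers, with learned kernels, is what gives convolutional networks their power on images and signals.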
Illusions of certainty: what our brains can teach us about software engineering
with Julie Pitt, Machine Intelligence Expert and Co-Founder of Order of Magnitude Labs
Death and taxes. These are the only things certain in life, as they say. And yet, software developers rely upon a third certainty: computers do what we tell them to and nothing more. When designing software systems, we tend to describe them in terms of “the happy path.” The tests we write reassure us that known scenarios achieve known outcomes. We account for failures as if they happen one at a time. When it comes to software systems in the wild, we often underestimate our uncertainty about their behavior, resulting in outages and late night conference calls. Why do we do this, and why is it a problem? What can we do about it? It turns out that looking at how our brains work provides some clues. Learn the full story during this talk.
How the Investigative Journalists of the ICIJ used modern Open Source Technologies to unearth the stories of the Panama Papers
with Michael Hunger, Graph Addict at Neo4j
The biggest leak in journalistic history has not only been mind-blowing for everyone but also challenging for the team of journalists and developers known as the ICIJ. With more than 11M documents totaling 2.6 TB of information, it is truly impressive that a small team of 3 developers could support more than 400 journalists through a year’s worth of investigative work.
This became possible through the efficient use of open source technology for scanning the documents and extracting text and metadata from them. The biggest difference, though, was made by the power of a graph database to connect the people, companies and accounts revealed in the investigation.
Especially for the non-technical journalists, the ability to unearth all those connections “was like magic”. Collaborating on the research, they benefited from each other’s work and watched the bigger picture grow more interesting every day.
In this talk, I detail the process and the technologies used by the journalists for their investigative work, including Apache Solr, Apache Tika, and Neo4j. Then, I focus on their work with Neo4j: the data model they developed and the types of queries and interactions that helped them grow their understanding. We discuss how tools for visual graph exploration and search enable even non-technical users to benefit from working with large amounts of connected data.
Using the officially published dataset of 3.4M records, we demonstrate how new insights in your existing, disconnected data are just one graph query away.
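To give a flavour of why connecting data as a graph is so powerful, here is a minimal sketch in plain Python (the officers, entities and relationships are invented for illustration; the real investigation used Neo4j and its Cypher query language): a breadth-first search that surfaces the chain linking two people through shared offshore entities, which is exactly the kind of question a graph database answers in a single query.

```python
from collections import deque

# Invented mini-graph in the spirit of the leak data:
# officers linked to the offshore entities they are connected to.
edges = [
    ("Officer A", "Entity X"),
    ("Officer B", "Entity X"),
    ("Officer B", "Entity Y"),
    ("Officer C", "Entity Y"),
]

# Build an undirected adjacency map.
graph = {}
for a, b in edges:
    graph.setdefault(a, set()).add(b)
    graph.setdefault(b, set()).add(a)

def connection(start, goal):
    """Breadth-first search: the shortest chain linking two nodes."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph[path[-1]] - seen:
            seen.add(nxt)
            queue.append(path + [nxt])
    return None  # no connection in the data

print(connection("Officer A", "Officer C"))
# -> ['Officer A', 'Entity X', 'Officer B', 'Entity Y', 'Officer C']
```

The point is not the code but the shape of the question: "how are these two parties connected?" is awkward in a relational database but natural in a graph.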
Exploring StackOverflow Data
with Evelina Gabasova, F# Expert and Researcher at University of Cambridge
When you’re stuck while programming – who you gonna call? StackOverflow! It’s an invaluable source of daily help to many. Interestingly, you can also download the entire data dump of StackOverflow and let machine learning loose on the dataset. In this talk I’ll look at what we can learn from the crowdsourced knowledge of developers worldwide. Along the way, you will also learn about the ideas behind some of the machine learning algorithms that can give us insights into complex data. I will use a combination of the statistical computing language R and the functional language F# to show how you can easily access and process large-scale data the functional way.
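The talk itself works in R and F#; as a small taste of the kind of question you can ask the data dump, here is a Python sketch (the sample rows are invented stand-ins for real dump records) that counts which tags appear together on the same question.

```python
from collections import Counter
from itertools import combinations

# Invented sample rows standing in for the StackOverflow data dump:
# each question carries a list of tags.
questions = [
    ["python", "pandas"],
    ["python", "machine-learning"],
    ["r", "machine-learning"],
    ["python", "pandas", "machine-learning"],
]

# Count how often each pair of tags co-occurs on a question.
pairs = Counter(
    pair
    for tags in questions
    for pair in combinations(sorted(tags), 2)
)

for (tag_a, tag_b), count in pairs.most_common(2):
    print(f"{tag_a} + {tag_b}: {count}")
```

Run over the full dump, simple counts like these already reveal which technologies cluster together; the talk goes further, applying machine learning to the same crowdsourced data.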
Discovering similar Research Ideas using Semantic Vectors and Machine Learning
with Mads Rydahl, Strategic Advisor at Unsilo
UNSILO works with leading Scientific Publishers to enrich their content and improve discoverability across domains and disciplines. Our discovery tools capture trending ideas and novel concepts as they emerge, and they help researchers find articles that describe parallel research of similar ideas across different domains and disciplines.
In this talk I will present our vision, the problems we are trying to solve in science, and some of the platforms and tools we use, with examples of text-mining tools and semantic vectors in action. To put these ideas into practice, I will describe two of the major challenges we are currently trying to solve, and outline the future direction of what we call Text Intelligence.
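As a rough sketch of the similarity machinery behind such tools (the texts are invented, and real semantic vectors are learned representations rather than the raw word counts used here), comparing documents comes down to cosine similarity between vectors:

```python
import math
from collections import Counter

# Invented mini-abstracts from different disciplines.
docs = {
    "bio":  "gene expression patterns in tumor cells",
    "cs":   "pattern recognition in image data with neural networks",
    "phys": "thermal properties of metal alloys",
}

def vectorize(text):
    """Bag-of-words vector: word -> count."""
    return Counter(text.lower().split())

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[w] * v[w] for w in u)
    norm = (math.sqrt(sum(c * c for c in u.values()))
            * math.sqrt(sum(c * c for c in v.values())))
    return dot / norm if norm else 0.0

vecs = {name: vectorize(text) for name, text in docs.items()}
for other in ("cs", "phys"):
    print("bio vs", other, round(cosine(vecs["bio"], vecs[other]), 3))
```

Raw word-count vectors only detect exact word overlap; the point of learned semantic vectors is to score two articles as related even when they describe a similar idea in the different vocabularies of different disciplines.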