Platform update!

We have been working on a few new things and we are happy to launch them in our latest update.

  • We upgraded the terminal to butterfly 3.1.5 – it is prettier 😉
  • We added a “private share” option – you can now publish your project and send a private link to a friend or a co-worker
  • We changed the privileges in Python, R, Julia images – you now have sudo rights to “make” for compiling custom libraries – it should be much easier to compose your custom image with all language and system packages
  • We put the “publish” and “packages” buttons at the top, as you requested!
  • The dashboard has a “running on top” option that filters your projects – the ones that are still running are on top

We also fixed a few things:

  • Our IDE slowed down when it had too much output – now huge logs from your machine learning algorithms won’t have such an effect
  • The “closed connection” error when launching the terminal in Julia and R
  • Bugs in the drag & drop file upload to the IDE

Our last update also had a few things that you might have missed:

  • There are all new cards in the Explore page with better search and filtering
  • We added a GitHub login option
  • There is now a public user profile if you want to showcase your projects
  • We added support for Python 3.5 projects
  • Python 3.5 is available along with SciPy
  • We updated the markdown support in readme files and public project page

We have also switched to the new Meteor 1.4 framework so it should be faster 🙂   If there is anything you would like to use on our platform just send us an email or contact us on our Facebook page.

The World Bank GDP Analysis using Pandas and Seaborn Python libraries

Pandas and Seaborn are two of the most useful Python libraries for data science. The first provides easy-to-use, high-performance data structures and methods for data manipulation. The second is built on top of matplotlib and provides a high-level interface for drawing attractive statistical graphics. How do they work?

Let’s check it out using World Bank GDP data from 10 European countries – Poland, Germany, Belarus, the Czech Republic, the Slovak Republic, Hungary, Estonia, France, Ukraine and the United Kingdom.

Just want to run the project? You can find it here: The World GDP Analysis Project

What are we looking for?

The question – How far along in economic development are eastern European countries relative to developed countries like Germany and France?

To answer it we need to analyze four GDP indicators – GDP per capita (US$), GDP per capita growth (annual %), GDP growth (annual %) and GDP (current US$).

The data from the World Bank (from the World Development Indicators website, to be exact) are in an open format and have good historical records for many countries, covering a number of economic and social indicators.

We chose the years 1990 – 2016 because only these were available for the selected indicators.

You can find the data here.

The code

First, we load the data from the CSV file. Then we remove the last 5 lines, because they contain empty values and information about the date of the last data update. We also have to remove the column for 2016 because, as it turned out, it is empty (no data). The “gdp.replace” call replaces the two dots (“..”) that mark missing values with NaN.
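The original listing is not reproduced here, so the following is only a minimal sketch of the loading and cleaning steps described above. The column names, the footer lines and all values in the inline CSV are made up for illustration; a real World Bank export may be laid out differently.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical miniature of the export: two data rows followed by five
# footer lines (empty rows plus notes about the source and update date).
# All values are illustrative, not real World Bank figures.
RAW_CSV = """Country Name,Country Code,Series Name,Series Code,1990,2015,2016
Poland,POL,Indicator A,A,..,3.0,..
Germany,DEU,Indicator A,A,5.0,1.0,..
,,,,,,
,,,,,,
,,,,,,
Data from database: World Development Indicators,,,,,,
Last Updated: (date),,,,,,
"""

gdp = pd.read_csv(io.StringIO(RAW_CSV))
gdp = gdp.iloc[:-5]               # drop the five footer lines
gdp = gdp.drop(columns="2016")    # the 2016 column holds no data
gdp = gdp.replace("..", np.nan)   # ".." marks a missing value

print(gdp)
```

Note that after this step a year column that originally contained “..” still holds its remaining numbers as text, which matters in the next step.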

While working further with the DataFrame I ran into mysterious errors, and at first I was not able to determine what was wrong. After some time I decided to check the types of the individual columns:
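A check of this kind could look like the sketch below. The small frame is a made-up stand-in for the cleaned data, with one year column that was read as text and one that was read as numbers.

```python
import numpy as np
import pandas as pd

# Made-up stand-in for the cleaned data (illustrative values only).
gdp = pd.DataFrame({
    "Country Name": ["Poland", "Germany"],
    "1990": [np.nan, "5.0"],   # text, because the raw column contained ".."
    "2015": [3.0, 1.0],        # parsed as numbers
})

# One line per column with the type pandas inferred for it.
print(gdp.dtypes)

# The year columns that are not numeric yet.
non_numeric_years = [
    col for col in ["1990", "2015"]
    if not pd.api.types.is_numeric_dtype(gdp[col])
]
print(non_numeric_years)
```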

To my surprise, the columns for 1990 to 1995 had the data type object rather than float64, so to be safe I decided to convert all the year columns to numeric values. To do this, I selected the columns from 4 up to the end (that is, all of the years) and used the “apply” method to apply the function “pd.to_numeric”, which converts every year column to floating point numbers.
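A minimal sketch of that conversion, assuming four leading metadata columns followed by the year columns (the column names and values are illustrative, not taken from the real file):

```python
import numpy as np
import pandas as pd

# Made-up frame: four metadata columns, then the year columns stored as text.
gdp = pd.DataFrame({
    "Country Name": ["Poland", "Germany"],
    "Country Code": ["POL", "DEU"],
    "Series Name": ["Indicator A", "Indicator A"],
    "Series Code": ["A", "A"],
    "1990": [np.nan, "5.0"],
    "1991": ["2.0", "4.0"],
})

# Everything from position 4 onwards is a year column.
year_columns = gdp.columns[4:]

# Convert each year column to numbers; NaN stays NaN.
gdp[year_columns] = gdp[year_columns].apply(pd.to_numeric)

print(gdp.dtypes)
```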

Each row held the name of the country, its code, the name of a World Bank data series, its code and, in the subsequent columns, the years. This arrangement was not very convenient, so I decided to reindex the table using the “pivot_table” function.
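The exact arguments used in the project are not shown, so this is one possible sketch of such a pivot. It assumes the indicator name and the country name become the row index, which is one layout that lets a whole indicator be selected at once. The names and values are made up.

```python
import pandas as pd

# Made-up frame in the original layout: one row per country and indicator.
gdp = pd.DataFrame({
    "Country Name": ["Poland", "Germany", "Poland", "Germany"],
    "Country Code": ["POL", "DEU", "POL", "DEU"],
    "Series Name": ["Indicator A", "Indicator A", "Indicator B", "Indicator B"],
    "Series Code": ["A", "A", "B", "B"],
    "1990": [1.0, 2.0, 10.0, 20.0],
    "1991": [1.5, 2.5, 15.0, 25.0],
})

year_columns = list(gdp.columns[4:])

# Index the rows by (indicator, country); the years stay as columns.
pivoted = gdp.pivot_table(
    values=year_columns,
    index=["Series Name", "Country Name"],
)

# Selecting one indicator returns all countries with all years.
indicator_a = pivoted.loc["Indicator A"]
print(indicator_a)
```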

This changed the DataFrame from this form:

worldbank pandas dataframe

to this one:

worldbank pivoted table in pandas

That way I can pull any economic indicator and immediately have all the countries along with all the years.

Now I can easily visualize the 4 selected indicators. For nicer graphs, import Seaborn and set the color palette so that each line on the graph is plotted in a different color. Try comparing the charts with and without Seaborn.

Drawing directly with pandas is really simple: from our pivot table choose the indicator you are interested in, then transpose the data (“.T”) and plot it (“.plot()”).
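As a rough sketch of the select, transpose and plot steps, assuming a pivot table indexed by indicator and country with years as columns (the frame and its values are made up, and Seaborn is left out so the example depends only on pandas and matplotlib):

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is needed

import pandas as pd

# Made-up pivot table: rows are (indicator, country), columns are years.
pivoted = pd.DataFrame(
    {"1990": [2.0, 1.0], "1991": [2.5, 1.5], "1992": [3.0, 2.0]},
    index=pd.MultiIndex.from_tuples(
        [("Indicator A", "Germany"), ("Indicator A", "Poland")],
        names=["Series Name", "Country Name"],
    ),
)

# Pick one indicator, then transpose so years are rows and countries columns.
by_year = pivoted.loc["Indicator A"].T

# pandas draws one line per column, i.e. one line per country.
ax = by_year.plot(title="Indicator A")
```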

The first two charts

  1. GDP (current US$), data from the World Bank

GDP plot eastern europe

  2. GDP per capita, data from the World Bank

GDP per capita pandas plot eastern europe

Let’s try to perform a simple regression on the GDP data to see if there is a chance that one day we can catch up with Germany. This time we will use the “lmplot” function from the Seaborn library, but first the data must be brought into the form of a time series.

Starting from a table with countries as columns, we need to create a table with only three columns: year, country and GDP. We do this through a series of operations: resetting the index (our table is initially indexed by year, one unique row per year) and renaming the resulting column. The key operation here is the “melt” function, which takes the values from the country columns and stacks them into successive rows. This lets us make the following transformation. The attached images omit some of the columns and rows, but I hope it’s clear.
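The reshaping described above can be sketched as follows. The column names “year”, “country” and “gdp” are chosen here for illustration, and the values are made up.

```python
import pandas as pd

# Made-up wide table: indexed by year, one column per country.
by_year = pd.DataFrame(
    {"Germany": [2.0, 2.5], "Poland": [1.0, 1.5]},
    index=["1990", "1991"],
)

long_form = (
    by_year.reset_index()                  # the year index becomes a column named "index"
    .rename(columns={"index": "year"})     # give that column a meaningful name
    .melt(id_vars="year", var_name="country", value_name="gdp")
)

# A regression needs numeric years, not text labels.
long_form["year"] = long_form["year"].astype(int)

print(long_form)
```

A frame in this long form is the kind of input Seaborn’s “lmplot” accepts, for example with x="year", y="gdp" and hue="country".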


We should get a result similar to this:



Be sure to also check how you can launch the project in PLON after importing it into your PLON account.

Important link

PLON Projects – Introduction to Julia

What is Julia? It’s a language designed for data science and technical computing, and its main features are simplicity and speed. It was created by Jeff Bezanson, Alan Edelman, Stefan Karpinski, and Viral B. Shah. Compared to other languages used for similar goals it’s one of the fastest solutions out there, and thanks to the active community you can use a growing number of packages. Although rankings like the TIOBE Index or the RedMonk Programming Language Rankings place Julia’s popularity outside the top 50 languages, it is surely something worth trying – especially if you are interested in getting the best out of your datasets. We use Julia a lot and in our opinion – you should at least give it a try 🙂

To get you started, we have prepared an introduction to the basics of the Julia language, which includes:

  1. Datatypes and Operators
  2. Variables and Collections
  3. Linear algebra
  4. Functions
  5. Control Flow
  6. Types
  7. Multiple-Dispatch
  8. Input and Output

>> Introduction to Julia <<


How to start experimenting with Julia in PLON? Here’s a short video:

More information:


Have feedback? Write it in the comments below, on our Facebook page or send it to
