coding beacon

[programming & visualization]

Monthly Archives: November 2013

Natural Language Processing and Data Mining Links

Data Mining Fusion: Graphing and Charting

Note: My personal choice fell on Shiny, as it is the most flexible front end for interactive 2-D and 3-D visualization for my purposes, coupled with R (a language for high-performance statistical computing).

Visualization Libraries


free and open source software for statistical computing and graphics (with numerous visualization packages)

Shiny (a notable dynamic visualization extension for R)

ggplot2 (a notable static visualization extension for R)

RStudio (IDE for R)

StatET (Eclipse-based IDE for R)

scatter3d: (downloadable from within R IDE)

scatterplot3d: (downloadable from within R IDE)

cloudplot: (downloadable from within R IDE)

rgl:… is a 3D visualization system based on OpenGL. It provides a medium to high level interface for use in R, currently modelled on classic R graphics, with extensions to allow for interaction. An rgl device at its core is a real-time 3D engine written in C++. It provides an interactive viewpoint navigation facility (mouse + wheel support) and an R programming interface.


GUI for data mining using R


open source software (license) built on the Eclipse/RCP platform in order to scale to address a wide range of applications and to benefit from the workbench and advanced plugin system implemented in Eclipse

datamining / exploration workflows


the world-leading open-source system for data mining


very similar to OpenFrameworks, but written in Java rather than C++


matplotlib: /* outstanding */

Python (the author, regrettably, passed away in 2012). matplotlib is a Python 2D plotting library which produces publication-quality figures in a variety of hardcopy formats and interactive environments across platforms. matplotlib can be used in Python scripts, the Python and IPython shells (à la MATLAB or Mathematica), web application servers, and six graphical user interface toolkits.


Javascript charting library for jQuery

Highcharts: /* outstanding */

Highcharts is a charting library written in pure HTML5/JavaScript, offering intuitive, interactive charts to your web site or web application. Highcharts currently supports line, spline, area, areaspline, column, bar, pie, scatter, angular gauges, arearange, areasplinerange, columnrange, bubble, box plot, error bars, funnel, waterfall and polar chart types.

Highstock: /* outstanding */ /* free for non-commercial use */

Highstock is solely based on native browser technologies and doesn’t require client side plugins like Flash or Java. Furthermore you don’t need to install anything on your server. No PHP or ASP.NET. Highstock needs only two JS files to run: The highstock.js core and either the jQuery, MooTools or Prototype framework. One of these frameworks is most likely already in use in your web page.



[plugins]:[draggable points]:


a library for making high-quality scientific graphics under Linux and Windows; a library for the fast data plotting and data processing of large data arrays; a library for working in window and console modes and for easy embedding into other programs; a library with large and growing set of graphics.

* 11 November 2013. New version (v.2.2) of MathGL is released. It brings speed-ups, new plot kinds and data handling functions, new plot styles, masks for bitmap output, a wx-widget, and a Lua interface.

* Javascript interface was developed with support of $DATADVANCE company.

gnuplot interfaces in ANSI C:

gnuplot_i talks to a gnuplot process by means of POSIX pipes. This implies that the underlying operating system has the notion of processes and pipes, and advertises them in a POSIX fashion. Since Windows does not respect this standard, this module will not compile on it, unless you have a compiler that offers a popen call on that platform or simulates it.
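The same pipe-based idea can be sketched in Python rather than ANSI C. This is not the gnuplot_i API: the helper names `build_script` and `plot_with_gnuplot` are hypothetical, and the sketch assumes a `gnuplot` executable on the PATH. When none is found, it returns False instead of failing.

```python
import shutil
import subprocess


def build_script(points, title="data"):
    """Return gnuplot commands that plot (x, y) pairs as inline data."""
    lines = [f"plot '-' with lines title '{title}'"]
    lines += [f"{x} {y}" for x, y in points]
    lines.append("e")  # 'e' ends an inline data block in gnuplot
    return "\n".join(lines) + "\n"


def plot_with_gnuplot(points, title="data"):
    """Send the script to a gnuplot child process through its stdin pipe.

    Returns False when no gnuplot executable is found (an assumption of
    this sketch), True once the script has been written to the pipe.
    """
    exe = shutil.which("gnuplot")
    if exe is None:
        return False
    # -persist keeps the plot window open after gnuplot exits
    proc = subprocess.Popen([exe, "-persist"], stdin=subprocess.PIPE, text=True)
    proc.communicate(build_script(points, title))
    return True


script = build_script([(0, 0), (1, 1), (2, 4)], title="squares")
```

Keeping the script builder separate from the pipe handling means the generated commands can be inspected or saved to a file on platforms where spawning gnuplot is not possible.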

[real time data streams]:


Kst is the fastest real-time large-dataset viewing and plotting tool available (you may be interested in some benchmarks) and has built-in data analysis functionality. Kst is very user-friendly (both the community and the program itself!). Kst contains many powerful built-in features and is expandable with plugins and extensions (see developer information in the “Resources” section). Kst is licensed under the GPL, and is as such freely available for anyone. What’s more, as of 2.0.x it is available on all of the following platforms: Microsoft Windows, Linux, Mac OS X. Note that KDE libraries are an optional dependency (i.e. you can run Kst without KDE, but you get additional features when running on a platform with KDE). See the “Downloads” section for pre-compiled executables or the sources.

Gigasoft ProEssentials:

Visual Studio.Net, ActiveX, C++ MFC

FindGraph:

Chart FX:

a range of platforms, including c++, java, html5, com, silverlight


visual basic and c++, good free help file, Chart control for Windows Forms

Nevron:

xygraph:

delphi (source code available)


very similar to Processing, but written in C++ rather than Java

ofxChart is a custom add-on for OpenFrameworks C++ library. It allows adding various 2d and 3d charts to your projects.

(*GUI controls: ofxUI)

(*GUI controls: ofxRemoteUI)


.NET, Java, ASP, COM, VB, PHP, Perl, Python, Ruby, ColdFusion, C++


c/c++ (includes CINT, c/c++ interpreter)


(noncommercial is free) C, Fortran 77 and Fortran 90/95. For some operating systems, the languages Perl, Python, Java and the C/C++ interpreter Ch are also supported


freeware, C, C++


nPlot – a minimalistic data analysis application (c?/c++?), GLib was used @ some point

NPlot charting library:

NPlot (formerly known as scpl) is a free charting library for .NET. It boasts an elegant and flexible API. NPlot includes controls for Windows.Forms, ASP.NET and a class for creating Bitmaps. A GTK# control is also available.


just an example of code using plotutils


java, javascript, a simple cross-mode GUI library for Processing.


java, processing, 2d 3d graphs


java, processing, 2d graphs (full interactive capabilities of Processing)


java, processing


provides a high-level, simple scenegraph for Processing, modeled on the API for the scenegraph and display list implemented by ActionScript 3. Nest is targeted toward developers familiar with AS3, who wish to organize on-screen objects in a display list hierarchy. As with the AS3 display list, Nest establishes parent-child relationships, applies parent transformations to children, and allows easy manipulation of on-screen objects through member variables such as x, y, rotation, and scale. In addition to the scenegraph, Nest also includes an event-based communication system (built on the Observer pattern as implemented by Java’s Observer interface), and some minimal UI components.
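To make the parent-child idea concrete, here is a minimal Python sketch of a display list in which parent transformations apply to children. It is not Nest's API (Nest is a Java/Processing library). The `Node` class, its method names, and the rotation sign convention are assumptions made for illustration.

```python
import math


class Node:
    """A display-list node with AS3-style x, y, rotation and scale members."""

    def __init__(self, x=0.0, y=0.0, rotation=0.0, scale=1.0):
        self.x, self.y = x, y
        self.rotation = rotation  # degrees; sign convention is an assumption
        self.scale = scale
        self.parent = None
        self.children = []

    def add_child(self, child):
        child.parent = self
        self.children.append(child)
        return child

    def local_to_global(self, px, py):
        """Map a point from this node's coordinates to root coordinates."""
        node = self
        while node is not None:
            angle = math.radians(node.rotation)
            cos_a, sin_a = math.cos(angle), math.sin(angle)
            # scale, then rotate, then translate by the node's position
            sx, sy = px * node.scale, py * node.scale
            px = sx * cos_a - sy * sin_a + node.x
            py = sx * sin_a + sy * cos_a + node.y
            node = node.parent
        return px, py


root = Node(x=100, y=50)
arm = root.add_child(Node(x=10, y=0, rotation=90, scale=2))
tip = arm.local_to_global(1, 0)  # approximately (110, 52)
```

Moving or rotating `root` changes where `tip` lands without touching `arm`, which is the practical benefit of organizing objects in a hierarchy.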


dashboard/visualization, connectable to any platform / database / text data import


delphi, c++ (both $$$ & free opensource)


TAChart, an open-source implementation similar to TeeChart, is bundled with the LCL of the Lazarus IDE (Free Pascal)

Orange: /* outstanding for multidimensional data */

python (visualization & datamining) (freeware opensource)


scientific computation and visualization environment, scriptable in BeanShell, Jython (the Python programming language), Groovy and JRuby (the Ruby programming language). This brings more power and simplicity to scientific computation. Programming can also be done in native Java. Finally, symbolic calculations can be done using the Matlab/Octave high-level interpreted language.


Professional Open-Source Software KNIME [naim] is a user-friendly graphical workbench for the entire analysis process: data access, data transformation, initial investigation, powerful predictive analytics, visualisation and reporting. The open integration platform provides over 1000 modules (nodes), including those of the KNIME community and its extensive partner network.

RPy and RPy2:

rpy2 is a redesign and rewrite of rpy. It is providing a low-level interface to R, a proposed high-level interface, including wrappers to graphical libraries, as well as R-like structures and functions.


an open source visualization program for exploring high-dimensional data. It provides highly dynamic and interactive graphics such as tours, as well as familiar graphics such as the scatterplot, barchart and parallel coordinates plots. Plots are interactive and linked with brushing and identification. GGobi is fully documented in the GGobi book: “Interactive and Dynamic Graphics for Data Analysis”.




Based on PyQwt (plotting widgets for PyQt4 graphical user interfaces) and on the scientific modules NumPy and SciPy, guiqwt is a Python library providing efficient 2D data-plotting features (curve/image visualization and related tools) for interactive computing and signal/image processing application development. Guiqwt plotting features are quite limited in terms of plot types compared to matplotlib. However the currently implemented plot types are much more efficient.

Enthought Tool Suite:

The Enthought Tool Suite (ETS) is a collection of components developed by Enthought and our partners, which we use every day to construct custom scientific applications. It includes a wide variety of components, including: an extensible application framework, application building blocks, 2-D and 3-D graphics libraries, scientific and math libraries, developer tools. The cornerstone on which these tools rest is the Traits package, which provides explicit type declarations in Python; its features include initialization, validation, delegation, notification, and visualization of typed attributes.


Chaco is a Python plotting application toolkit that facilitates writing plotting applications at all levels of complexity, from simple scripts with hard-coded data to large plotting programs with complex data interrelationships and a multitude of interactive tools. While Chaco generates attractive static plots for publication and presentation, it also works well for interactive data visualization and exploration.


Mayavi seeks to provide easy and interactive visualization of 3-D data. It offers: (1) An (optional) rich user interface with dialogs to interact with all data and objects in the visualization. (2) A simple and clean scripting interface in Python, including one-liners, or an object-oriented programming interface. Mayavi integrates seamlessly with numpy and scipy for 3D plotting and can even be used in IPython interactively, similarly to Matplotlib. (3) The power of the VTK toolkit, harnessed through these interfaces, without forcing you to learn it. (4) Additionally Mayavi is a reusable tool that can be embedded in your applications in different ways or combined with the Envisage application-building framework to assemble domain-specific tools.


Macros are a quick way to customize and extend Canopy. They can help you to automate tasks which are frequent or complicated.

Qwt:

Qwt 6.1 might be usable in all environments where you find Qt. It is compatible with Qt4 ( >= 4.4 ) and Qt5. (Curve Plots, Scatter Plot, Spectrogram, Contour Plot, Histogram, Dials, Compasses, Knobs, Wheels, Sliders, Thermos)


wxChart is a control that allows you to create charts. At the moment the type of charts available are Bar, Bar 3D, Pie and Pie 3D. Other chart types will be added soon.

Anti-Grain Geometry(AGG): /*outstanding*//*latest update was in 2007, author might be busy, perhaps it’s time to move the project to github*/

Anti-Grain Geometry (AGG) is an Open Source, free of charge graphic library, written in industrially standard C++. The terms and conditions of use of AGG are described on The License page. AGG doesn’t depend on any graphic API or technology. Basically, you can think of AGG as a rendering engine that produces pixel images in memory from some vectorial data. But of course, AGG can do much more than that. The ideas and the philosophy of AGG are: Anti-Aliasing. Subpixel Accuracy. The highest possible quality. High performance. Platform independence and compatibility. Flexibility and extensibility. Lightweight design. Reliability and stability (including numerical stability).

wxArt2D :

WxArt2D is a library for 2D graphical programming. WxArt2D is built on top of the wxWidgets Library. It is built around a document View Framework, and has several graphical drawing context classes. You can display (multiple and different levels) views of a document filled with a hierarchy of graphical objects. Tools allow you to zoom, drag, edit etc. the objects on the view.

wxMathPlot: http://wxmathplot.sourceforge.net

a library to add 2D scientific plot functionality to wxWidgets. It lets you embed inside your program a window for plotting scientific, statistical or mathematical data, with additions like legend or coordinate display in overlay. Multi-platform: runs everywhere wxWidgets does.

[py plotting tools]:

… might need to use Boost.Python

ComponentOne Chart:

ComponentOne Studio for WinRT XAML includes UI controls for data visualization, layout and input. Based on the ComponentOne Silverlight controls and designed to enhance the rich user experience.


The idea is to provide a pure ANSI/ISO C++ plot class (called PPlot). Of course no actual plotting can be done in C++ alone. The connection to the graphical world (widgets) is made via an abstract class that you have to implement. The class is called Painter and asks you to implement things like: draw a line from (x1,y1) to (x2,y2); draw a text at position (x,y); calculate the width of a text when drawn on screen. (The Painter class has been implemented in Qt (a nice C++ framework) and Zinc (an obscure API used in real-time computing).)
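The abstract-painter design can be sketched in a few lines of Python (PPlot itself is C++, and its real method names are not given here). Everything below is hypothetical: the method names, the `RecordingPainter` backend, the fixed character width, and the `draw_x_axis` helper.

```python
from abc import ABC, abstractmethod


class Painter(ABC):
    """Drawing backend that a toolkit-specific subclass must implement."""

    @abstractmethod
    def draw_line(self, x1, y1, x2, y2): ...

    @abstractmethod
    def draw_text(self, x, y, text): ...

    @abstractmethod
    def text_width(self, text): ...


class RecordingPainter(Painter):
    """Hypothetical backend that records calls instead of drawing."""

    CHAR_WIDTH = 6  # assumed fixed-width font, in pixels

    def __init__(self):
        self.calls = []

    def draw_line(self, x1, y1, x2, y2):
        self.calls.append(("line", x1, y1, x2, y2))

    def draw_text(self, x, y, text):
        self.calls.append(("text", x, y, text))

    def text_width(self, text):
        return len(text) * self.CHAR_WIDTH


def draw_x_axis(painter, width, y, label):
    """Toolkit-independent plot code: it only talks to the Painter."""
    painter.draw_line(0, y, width, y)
    # centre the label under the axis using the backend's text metrics
    label_x = (width - painter.text_width(label)) // 2
    painter.draw_text(label_x, y + 12, label)


painter = RecordingPainter()
draw_x_axis(painter, width=200, y=100, label="time")
```

Porting the plot code to another toolkit then means writing one more `Painter` subclass. A recording backend like this one also makes the plot logic testable without a display.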


C++11 plotting library for console apps

TeeChart Pro:

.NET, Java, ActiveX / COM, PHP and Delphi VCL / FireMonkey controls for business, Real-time, Financial, Scientific and Mobile applications.


Graph-tool is an efficient Python module for manipulation and statistical analysis of graphs (a.k.a. networks). Contrary to most other python modules with similar functionality, the core data structures and algorithms are implemented in C++, making extensive use of template metaprogramming, based heavily on the Boost Graph Library. This confers it a level of performance which is comparable (both in memory usage and computation time) to that of a pure C/C++ library. (Many algorithms are implemented in parallel using OpenMP)


Graphviz – Graph Visualization Software. Drawing graphs since 1988


Cairo is a 2D graphics library with support for multiple output devices. Currently supported output targets include the X Window System (via both Xlib and XCB), Quartz, Win32, image buffers, PostScript, PDF, and SVG file output. Experimental backends include OpenGL, BeOS, OS/2, and DirectFB. Cairo is designed to produce consistent output on all output media while taking advantage of display hardware acceleration when available (e.g. through the X Render Extension). The cairo API provides operations similar to the drawing operators of PostScript and PDF. Operations in cairo include stroking and filling cubic Bézier splines, transforming and compositing translucent images, and antialiased text rendering. All drawing operations can be transformed by any affine transformation (scale, rotation, shear, etc.). Cairo is implemented as a library written in the C programming language, but bindings are available for several different programming languages. Cairo is free software and is available to be redistributed and/or modified under the terms of either the GNU Lesser General Public License (LGPL) version 2.1 or the Mozilla Public License (MPL) version 1.1 at your option.



Gadfly is a plotting and data visualization system written in Julia. It’s influenced heavily by Leland Wilkinson’s book The Grammar of Graphics and Hadley Wickham’s refinement of that grammar in ggplot2. It renders publication quality graphics to PNG, Postscript, PDF, SVG, and Javascript. The Javascript backend uses d3 to add interactivity like panning, zooming, and toggling.




Data-Driven Documents (D3):

D3.js is a JavaScript library for manipulating documents based on data. D3 helps you bring data to life using HTML, SVG and CSS. D3’s emphasis on web standards gives you the full capabilities of modern browsers without tying yourself to a proprietary framework, combining powerful visualization components and a data-driven approach to DOM manipulation.


Stock market, commodity and technical analysis charting app based on the Qt toolkit. Extendible plugin system for quotes and indicators. Portfolio, back testing, chart objects and many more features included.

[ta-lib]:  http://ta-lib.org

[adobe flash online charting]:

Visualization Library:

C++ (site offline)



Simple Directmedia Layer (SDL):

Simple DirectMedia Layer is a cross-platform development library designed to provide low level access to audio, keyboard, mouse, joystick, and graphics hardware via OpenGL and Direct3D.

Development Libraries: Windows: (Visual C++ 32/64-bit) SDL2-devel-2.0.1-mingw.tar.gz (MinGW 32/64-bit), Linux.


Gosu is a 2D game development library for the Ruby and C++ programming languages, available for Mac OS X, Windows, and Linux.

Data Visualization References

[software list]:

[softpedia charting software list]:

[30 best tools for data viz’n]:

[blog dedicated to graphs & charts]:

Free Technical Analysis Libraries

TA-Lib : Technical Analysis Library:

Multi-Platform Tools for Market Analysis … TA-Lib is widely used by trading software developers who need to perform technical analysis of financial market data. It includes 200 indicators such as ADX, MACD, RSI, Stochastic and Bollinger Bands, candlestick pattern recognition, and an open-source API for C/C++, Java, Perl, Python and 100% Managed .NET. Free open-source library: TA-Lib is available under a BSD License, allowing it to be integrated in your own open-source or commercial application.
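For a feel of what such indicators compute, here is a pure-Python sketch of a simple moving average and Bollinger Bands. It deliberately does not use or imitate the TA-Lib API. The function names, the made-up prices, and the choice of population standard deviation are assumptions of this sketch.

```python
from statistics import mean, pstdev


def sma(values, period):
    """Simple moving average; one output per full window."""
    return [mean(values[i - period:i]) for i in range(period, len(values) + 1)]


def bollinger_bands(values, period=20, width=2.0):
    """Return (lower, middle, upper): SMA -/+ width * population std dev."""
    lower, middle, upper = [], [], []
    for i in range(period, len(values) + 1):
        window = values[i - period:i]
        mid = mean(window)
        dev = width * pstdev(window)
        lower.append(mid - dev)
        middle.append(mid)
        upper.append(mid + dev)
    return lower, middle, upper


closes = [10.0, 11.0, 12.0, 11.0, 10.0, 9.0]  # made-up prices
averages = sma(closes, 3)
lower, middle, upper = bollinger_bands(closes, period=3)
```

With a window of 3, six prices yield four output values. A production library would typically also handle missing data and much longer series efficiently.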


Numerical Analysis Software

GNU Radio:

* uhd_fft – A very simple spectrum analyzer tool

* Extending GNU Radio in C++:

* GNU Radio Companion (GRC) is a graphical tool for creating signal flow graphs and generating flow-graph source code

Data Mining Resources


Weka 3: Data Mining Software in Java, Weka—Machine Learning Software in Java


spss alternative opensource

Scicos :

Scicos is a graphical dynamical system modeler and simulator developed in the Metalau project at INRIA, Paris-Rocquencourt center. With Scicos, users can create block diagrams to model and simulate the dynamics of hybrid dynamical systems and compile models into executable code. Scicos is used for signal processing, systems control, queuing systems, and to study physical and biological systems. New extensions allow generation of component based modeling of electrical and hydraulic circuits using the Modelica language.

[lists] /*outstanding!*/ //the author of the data in the previous link

The Art of Unix Programming


The Kanban method

(ripoff from wikipedia & other sources)

translates roughly as “signal card”

Kanban is a method for managing knowledge work with an emphasis on just-in-time delivery while not overloading the team members. In this approach, the process, from definition of a task to its delivery to the customer, is displayed for participants to see, and developers pull work from a queue.

The Kanban method is rooted in four basic principles:

  • Start with what you do now
  • Agree to pursue incremental, evolutionary change
  • Respect the current process, roles, responsibilities & titles
  • Leadership at all levels

Six core practices

Anderson identified five core properties that had been observed in each successful implementation of the Kanban method.[2] They were later relabeled as practices and extended with the addition of a sixth.

1. Visualise

The workflow of knowledge work is inherently invisible. Visualising the flow of work and making it visible is core to understanding how work proceeds. Without understanding the workflow, making the right changes is harder.
A common way to visualise the workflow is to use a card wall with cards and columns. The columns on the card wall represent the different states or steps in the workflow.

2. Limit WIP
Limiting work-in-process implies that a pull system is implemented on parts or all of the workflow. The pull system will act as one of the main stimuli for continuous, incremental and evolutionary changes to your system.
The pull system can be implemented as a kanban system, a CONWIP system, a DBR system, or some other variant. The critical elements are that work-in-process at each state in the workflow is limited and that new work is “pulled” into the new information discovery activity when there is available capacity within the local WIP limit.
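A WIP-limited pull system can be sketched in a few lines of Python. The `KanbanBoard` class, the column names, and the limits below are hypothetical choices for illustration. The point is only that an item moves when the downstream column pulls it and has capacity.

```python
class KanbanBoard:
    """Columns with WIP limits; each column pulls from the one before it."""

    def __init__(self, limits):
        # limits: ordered mapping of column name -> WIP limit (None = unlimited)
        self.limits = dict(limits)
        self.columns = {name: [] for name in self.limits}
        self.order = list(self.limits)

    def has_capacity(self, name):
        limit = self.limits[name]
        return limit is None or len(self.columns[name]) < limit

    def pull(self, name):
        """Pull the oldest item from the previous column if capacity allows."""
        idx = self.order.index(name)
        if idx == 0:
            raise ValueError("first column has no upstream column")
        upstream = self.columns[self.order[idx - 1]]
        if not upstream or not self.has_capacity(name):
            return None
        item = upstream.pop(0)
        self.columns[name].append(item)
        return item


board = KanbanBoard({"backlog": None, "develop": 2, "test": 1, "done": None})
board.columns["backlog"] = ["A", "B", "C"]
first = board.pull("develop")    # "A"
second = board.pull("develop")   # "B"
blocked = board.pull("develop")  # None: develop is at its WIP limit of 2
board.pull("test")               # "A" moves on and frees a slot
third = board.pull("develop")    # "C"
```

Nothing is ever pushed downstream: "C" waits in the backlog until "A" leaves the develop column, which is how the limit makes bottlenecks visible.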

3. Manage flow
The flow of work through each state in the workflow should be monitored, measured and reported. By actively managing the flow the continuous, incremental and evolutionary changes to the system can be evaluated to have positive or negative effects on the system.
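Two flow measures that are simple to compute are cycle time and throughput. The sketch below uses made-up dates and hypothetical helper names, and it assumes that only a start date and a delivery date are recorded per item.

```python
from datetime import date
from statistics import mean

# (item, date work started, date delivered); made-up sample data
finished = [
    ("A", date(2013, 11, 4), date(2013, 11, 8)),
    ("B", date(2013, 11, 5), date(2013, 11, 12)),
    ("C", date(2013, 11, 11), date(2013, 11, 14)),
]


def cycle_times(items):
    """Days each item spent between start and delivery."""
    return {name: (done - start).days for name, start, done in items}


def throughput(items, period_start, period_end):
    """Number of items delivered within the period (inclusive)."""
    return sum(1 for _, _, done in items if period_start <= done <= period_end)


times = cycle_times(finished)                    # {"A": 4, "B": 7, "C": 3}
average_cycle_time = mean(list(times.values()))  # 14 / 3 days
weekly = throughput(finished, date(2013, 11, 11), date(2013, 11, 17))  # 2
```

Tracking these numbers before and after a process change gives a basis for judging whether the change helped or hurt.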

4. Make policies explicit
Until the mechanism of a process is made explicit it is often hard or impossible to hold a discussion about improving it. Without an explicit understanding of how things work and how work is actually done, any discussion of problems tends to be emotional, anecdotal and subjective. With an explicit understanding it is possible to move to a more rational, empirical, objective discussion of issues. This is more likely to facilitate consensus around improvement suggestions.

5. Implement feedback loops
Collaboration to review flow of work and demand versus capability measures, metrics and indicators coupled with anecdotal narrative explaining notable events is vital to enabling evolutionary change. Organizations that have not implemented the second level of feedback – the operations review – have generally not seen process improvements beyond a localized team level. As a result, they have not realized the full benefits of Kanban observed elsewhere.

6. Improve collaboratively, evolve experimentally (using models and the scientific method)
The Kanban method encourages small continuous, incremental and evolutionary changes that stick. When teams have a shared understanding of theories about work, workflow, process, and risk, they are more likely to be able to build a shared comprehension of a problem and suggest improvement actions which can be agreed by consensus.
The Kanban method suggests that a scientific approach is used to implement continuous, incremental and evolutionary changes. The method does not prescribe a specific scientific method to use.


Kanban is a new technique for managing a software development process in a highly efficient way. Kanban underpins Toyota’s “just-in-time” (JIT) production system. Although producing software is a creative activity and therefore different to mass-producing cars, the underlying mechanism for managing the production line can still be applied.

A software development process can be thought of as a pipeline with feature requests entering one end and improved software emerging from the other end.

Inside the pipeline, there will be some kind of process which could range from an informal ad hoc process to a highly formal phased process. In this article, we’ll assume a simple phased process of: (1) analyse the requirements, (2) develop the code, and (3) test it works.


How to get started with Kanban…

1. Map your value stream (your development process).
Where do feature ideas come from? What are all the steps that the idea goes through until it’s sitting in the hands of the end-user?
2. Define the start and end points for the Kanban system.
These should preferably be where you have political control. Don’t worry too much about starting with a narrow focus, as people outside the span will soon ask to join in.
3. Agree:
  • Initial WIP limits and policies for changing or temporarily breaking them
  • Process for prioritising and selecting features
  • Policies for different classes of service (e.g. “standard”, “expedite”, “fixed delivery date”). Are estimates needed? When choosing work, which will be selected first?
  • Frequency of reviews
4. Draw up a Kanban board.
All you need is a whiteboard and some Post-It™ notes. Don’t spend too much time making it look beautiful because it will almost certainly evolve.
5. Start using it.
6. Empirically adjust.
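One of the policies agreed in step 3, which work is selected first, can be written down as an explicit rule. The ordering below (expedite, then fixed delivery date by due date, then standard by age) is only an assumed example policy. The card fields and the `select_next` helper are hypothetical.

```python
from datetime import date

# Lower rank is selected first; this ordering is an assumed policy.
CLASS_RANK = {"expedite": 0, "fixed delivery date": 1, "standard": 2}


def select_next(queue):
    """Pick the next card: by class of service, then due date, then age."""
    def key(card):
        due = card.get("due") or date.max  # cards without a date sort last
        return (CLASS_RANK[card["class"]], due, card["created"])
    return min(queue, key=key) if queue else None


queue = [
    {"id": 1, "class": "standard", "created": date(2013, 11, 1)},
    {"id": 2, "class": "fixed delivery date", "created": date(2013, 11, 3),
     "due": date(2013, 12, 1)},
    {"id": 3, "class": "expedite", "created": date(2013, 11, 5)},
    {"id": 4, "class": "fixed delivery date", "created": date(2013, 11, 4),
     "due": date(2013, 11, 20)},
]
chosen = select_next(queue)  # the expedite card, id 3
```

Writing the rule as code, or simply on the board, removes ambiguity when two people reach for the next card at the same time.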