A brief introduction to D3.js.
Those who have used Python or R for statistical inference or to build machine learning models know that these environments, together with their rich libraries, let the practitioner do almost anything. They are great for prototyping and running experiments. Python, in conjunction with Django or Flask, can be used to build fully self-contained web applications. R with its Shiny framework can be used to build interactive dashboards. The shortcoming is that these applications have to run on a server, and the resources have to be paid for before the content can be shared with a wider audience. One solution is to use the audience's own computing power to transform and present the data.
I have some experience in using R and Python. The main issue I have encountered is that moving from the experimental phase to publishing is always difficult. Once a project is concluded, it often goes back into the drawer, because publishing it would require paying a cloud provider for computational power.
D3.js is a JavaScript library used to create data-driven interactive documents. It is not a charting library that can be used to create interactive content quickly. But once it is mastered, it becomes an invaluable help in publishing interactive documents. D3.js has been used to create many visually appealing and informative visualisations on Observable.
Here, we will build an interactive D3.js widget that plots an arbitrary function and calculates the integral under the selected part of the plot. We will use the probability density function of the standard normal distribution and let the widget calculate the area under the curve between two specified x values, i.e. the probability that the variable falls between them.
function f(x) {
    // standard normal pdf
    return (1/Math.sqrt(2*Math.PI))*Math.exp(-x*x/2);
}
\[ p(x; 0, 1) = \frac{1}{\sqrt{2\pi}}\exp\left( -\frac{x^2}{2} \right) \]
The CDF of the normal distribution has no closed form in terms of elementary functions and is expressed through the error function:
\[ \text{CDF}(x) = \frac{1}{2} \left[ 1 + \text{erf} \left( \frac{x-\mu}{\sigma \sqrt{2}} \right) \right] \]
where
\[ \text{erf}\ z = \frac{2}{\sqrt{\pi}} \int_0^z e^{-t^2}\ dt \]
The usual way to calculate the CDF is to use statistical software such as R, e.g. \(P(-1 < X < 1) =\) pnorm(1) - pnorm(-1) = 0.6826895, or to read the value from statistical tables.
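For readers without R at hand, the same number can be checked in plain JavaScript. The sketch below is only an illustration and not what the widget does: it assumes the Abramowitz and Stegun approximation 7.1.26 for erf (absolute error of roughly 1.5e-7), and the names `erf` and `normalCdf` are made up for this example.

```javascript
// Approximate erf(z) with the Abramowitz & Stegun formula 7.1.26.
// Assumption: an absolute error of roughly 1.5e-7 is acceptable.
function erf(z) {
  const sign = z < 0 ? -1 : 1;
  const x = Math.abs(z);
  const t = 1 / (1 + 0.3275911 * x);
  // Horner evaluation of the degree-5 polynomial in t
  const poly = ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t
    - 0.284496736) * t + 0.254829592) * t;
  return sign * (1 - poly * Math.exp(-x * x));
}

// CDF of a normal distribution with mean mu and standard deviation sigma.
function normalCdf(x, mu = 0, sigma = 1) {
  return 0.5 * (1 + erf((x - mu) / (sigma * Math.SQRT2)));
}

// P(-1 < X < 1) for the standard normal distribution
const p = normalCdf(1) - normalCdf(-1);
console.log(p); // should be close to the pnorm result quoted above
```

Because the approximation error is small but not zero, the last digits can differ slightly from the value reported by R.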
In this example we use the trapezoidal rule to calculate the approximate area under the curve:
\[ \int_a^b f(x)\ dx \approx \Delta x \left( \sum\limits_{k=1}^{N-1} f(x_k) + \frac{f(x_N)+f(x_0)}{2} \right) \]
where \(\Delta x = (b-a)/N\) and \(x_k = a + k\,\Delta x\).
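The trapezoidal rule can be sketched in a few lines of plain JavaScript. This is a minimal illustration, not necessarily the code used in the widget: the names `pdf` and `trapezoid` and the choice of N = 1000 sub-intervals are assumptions made for this example.

```javascript
// Standard normal probability density function.
function pdf(x) {
  return Math.exp(-x * x / 2) / Math.sqrt(2 * Math.PI);
}

// Composite trapezoidal rule on [a, b] with n equal sub-intervals.
function trapezoid(f, a, b, n) {
  const dx = (b - a) / n;
  let sum = (f(a) + f(b)) / 2;   // the two end points carry half weight
  for (let k = 1; k < n; k++) {
    sum += f(a + k * dx);        // interior points carry full weight
  }
  return dx * sum;
}

// Approximate P(-1 < X < 1) for the standard normal distribution.
const area = trapezoid(pdf, -1, 1, 1000);
console.log(area.toFixed(4)); // 0.6827
```

With 1000 sub-intervals the result agrees with pnorm(1) - pnorm(-1) to about six decimal places; a coarser grid is faster but less accurate.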
D3.js can manipulate the whole DOM, but we are interested in plotting with SVG. First we have to add an <svg></svg> block to the DOM as a child of the <body> element.
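// plot dimensions in pixels, reused by the snippets that follow
var width = 600, height = 600;
var svg = d3.select("body").append("svg")
    .attr("width", width).attr("height", height);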
Here we selected the <body> element and added <svg> as one of its children. We also set the width to 600px and the height to 600px.
Another very important element of D3.js scripts is the scale:
var scalerLin = d3.scaleLinear()
    .domain([0, 1])
    .range([10, width-10])
    .nice();
var scalerLog = d3.scaleLog()
    .domain([1e-3, 1e6])
    .range([10, width-10])
    .nice();
var scalerLinInv = d3.scaleLinear()
    .domain([0, 1])
    .range([width-10, 10])
    .nice();
var axisLin = d3.axisTop(scalerLin);
var axisLog = d3.axisTop(scalerLog);
var axisLinInv = d3.axisTop(scalerLinInv);
svg.append("g").attr("transform", `translate(0, ${height/4})`).call(axisLog);
svg.append("g").attr("transform", `translate(0, ${height/2})`).call(axisLin);
svg.append("g").attr("transform", `translate(0, ${3*height/4})`).call(axisLinInv);
Scalers created with the d3.scale* family of functions play an important role in translating virtual (data) coordinates into pixel coordinates within the SVG node.
Once the scaler is defined, we can use it to create an axis. Note that, by default, the axis is placed at the (0, 0) coordinate of its parent. We change this behaviour by adding a new generic <g> element and moving it to the desired place with the transform attribute. Here, we use the height of the plotting area to space the horizontal axes at equal distances.
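To make the translation from data coordinates to pixels concrete, the mapping can be written out by hand. The helpers `linearScale` and `logScale` below are hypothetical stand-ins written for this illustration, not D3.js functions; they assume a two-element domain and range and ignore features such as .nice(), clamping and tick generation.

```javascript
// Hand-rolled stand-ins for a linear and a logarithmic scaler.
// Assumption: two-element domain and range, no .nice(), no clamping.
function linearScale([d0, d1], [r0, r1]) {
  // position of x within the domain, stretched onto the range
  return x => r0 + (x - d0) / (d1 - d0) * (r1 - r0);
}

function logScale([d0, d1], [r0, r1]) {
  // a log scale is a linear scale applied to log10 of the value
  const lin = linearScale([Math.log10(d0), Math.log10(d1)], [r0, r1]);
  return x => lin(Math.log10(x));
}

const width = 600;
const lin = linearScale([0, 1], [10, width - 10]);
const inv = linearScale([0, 1], [width - 10, 10]);
const log = logScale([1e-3, 1e6], [10, width - 10]);

console.log(lin(0.5)); // 300, the middle of the pixel range
console.log(inv(0));   // 590, the inverted scale starts on the right
console.log(log(1));   // one third of the way along, since log10(1) = 0 lies 3 of 9 decades in
```

A D3.js scaler is called in the same way, as a function of the data value, for example scalerLin(0.5).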
To draw lines, circles, and text we can use the corresponding SVG elements such as <line> and <circle>.
// a diagonal line from the bottom-left to the top-right corner
svg.append("line")
    .attr("stroke", "black")
    .attr("stroke-width", 3)
    .attr("x1", 50).attr("y1", height-50)
    .attr("x2", width-50).attr("y2", 50);
// one red circle at each end of the line
var centres = [[50, height-50], [width-50, 50]];
svg.append("g")
    .selectAll("circle")
    .data(centres).enter()
    .append("circle")
    .attr("r", 7)
    .attr("cx", d => d[0])
    .attr("cy", d => d[1])
    .attr("stroke", "none")
    .attr("fill", "red");
Drawing lines this way is not the most efficient solution; it is better to use the turtle-like path language of the d attribute of the <path> element. To do that, we keep the (x, y) coordinates stored as tuples in an array, so that it looks as follows: [[x1, y1], [x2, y2], ... , [xN, yN]]. D3.js provides line generator factories that automate the line drawing and let us decide whether we want to use spline interpolation, connect the dots with straight segments, or use one of the other methods from the library.
var margin = {top: 50, right: 50, bottom: 50, left: 50};
var data = Array.from({length: 7}, (d,i) => [i, Math.random()]);
var xScaler = d3.scaleLinear()
    .domain([0, data.length - 1])
    .range([margin.left, width - margin.right]);
var yScaler = d3.scaleLinear()
    .domain([0, 1])
    .range([height - margin.bottom, margin.top]);
// convert the data coordinates to pixel coordinates
data = data.map( d => [xScaler(d[0]), yScaler(d[1])]);
svg.append("g")
    .selectAll("circle")
    .data(data).enter()
    .append("circle")
    .attr("r", 7)
    .attr("fill", "black")
    .attr("cx", d => d[0])
    .attr("cy", d => d[1]);
// line generator that joins the points with a natural cubic spline
var lineMaker = d3.line().curve(d3.curveNatural);
svg.append("g")
    .append("path")
    .attr("d", lineMaker(data))
    .attr("fill", "none")
    .attr("stroke", "steelblue")
    .attr("stroke-width", 4);
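To show what the turtle-like language of the d attribute looks like, the sketch below builds a path string by hand for points joined with straight segments. The helper `toPathD` is hypothetical and written only for this illustration; a line generator configured with a spline curve would be expected to emit curve commands instead of the plain L commands used here.

```javascript
// Hypothetical helper: turn [[x1, y1], ..., [xN, yN]] into a path "d" string.
// M moves the pen to the first point, L draws a straight line to each next one.
function toPathD(points) {
  return points
    .map(([x, y], i) => (i === 0 ? "M" : "L") + x + "," + y)
    .join("");
}

const d = toPathD([[50, 550], [300, 120], [550, 50]]);
console.log(d); // M50,550L300,120L550,50
```

The resulting string can be assigned directly to the d attribute of a <path> element.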
So far we have discussed how to use scalers, how to add axes to the plot, and how to plot a line. One vital piece of functionality that we still have to add to the widget is responding to mouse events such as mouseover or click.
var circle = svg.append("circle")
    .attr("r", 50)
    .attr("cx", width/2)
    .attr("cy", height/2)
    .attr("fill", "gray")
    .on("click", function(event) {
        // toggle the fill colour between gray and black
        var el = d3.select(this);
        var col = el.attr("fill");
        el.attr("fill", col === "gray" ? "black" : "gray");
    });
svg.append("text")
    .attr("font-family", "sans-serif")
    .attr("font-size", 24)
    .attr("text-anchor", "middle")
    .attr("x", width/2).attr("y", height/2 + 10)
    .attr("fill", "white")
    .attr("pointer-events", "none") // let clicks on the label reach the circle
    .text("click!");
The last missing element needed to build our widget is the drag-and-drop functionality. To add drag behaviour to the application we use the d3.drag() factory function and specify what actions we want to take when the drag gesture starts, continues, and ends. Then we call the resulting drag behaviour with the desired selection as a parameter.
// status text, hidden until a drag starts
var hud = svg.append("text")
    .attr("x", 50).attr("y", 50)
    .attr("visibility", "hidden");
// group holding the draggable circle and its pointer label
var dragon = svg.append("g");
var circle = dragon.append("circle")
    .attr("r", 10)
    .attr("cx", width/2)
    .attr("cy", height/2)
    .attr("fill", "black");
var pointer = dragon.append("text")
    .attr("x", width/2+7)
    .attr("y", height/2+37)
    .attr("font-size", 48)
    .attr("text-anchor", "middle")
    .text("👆");
var drag = d3.drag()
    .on("start", function(event) {
        pointer.attr("visibility", "hidden");
        hud.text("Drag event started...")
            .attr("visibility", "visible");
    }).on("drag", function(event) {
        // pointer position relative to the SVG node
        var pos = d3.pointer(event, svg.node());
        circle.attr("cx", pos[0]).attr("cy", pos[1]);
        pointer.attr("x", pos[0] + 7).attr("y", pos[1] + 37);
        hud.text("You are dragging the element...");
    }).on("end", function() {
        pointer.attr("visibility", "visible");
        hud.text("Drag event stopped...");
        // hide the status text again after one second
        d3.timeout(function() {
            hud.attr("visibility", "hidden");
        }, 1000);
    });
drag(dragon);
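In a widget that selects an interval on the x axis, the pixel position reported during a drag usually has to be kept inside the plotting area and translated back into a data value. The sketch below shows one way this could be done in plain JavaScript. The helpers `clamp` and `invertLinear`, the domain [-4, 4] and the pixel range [50, 550] are all assumptions made for this illustration; continuous D3.js scalers offer an invert() method for the same pixel-to-data conversion.

```javascript
// Hypothetical helpers for a draggable handle on a horizontal axis.

// Keep a pixel coordinate inside the interval [lo, hi].
function clamp(px, lo, hi) {
  return Math.max(lo, Math.min(hi, px));
}

// Inverse of a linear scale: pixel coordinate -> data value.
function invertLinear([d0, d1], [r0, r1]) {
  return px => d0 + (px - r0) / (r1 - r0) * (d1 - d0);
}

const domain = [-4, 4];   // assumed x range of the plotted pdf
const range = [50, 550];  // assumed pixel range of the x axis
const toX = invertLinear(domain, range);

// A pointer dragged beyond the right edge is pinned to the end of the axis.
const px = clamp(700, range[0], range[1]);
console.log(px, toX(px)); // 550 4
console.log(toX(300));    // 0, the centre of the axis
```

The two data values obtained this way from a pair of handles would then serve as the integration limits a and b.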
D3.js can be used to create interactive documents from within the RStudio environment by using the r2d3 package. This website was written in Markdown, and the D3.js code chunks were added and executed from within it. For more information, see the r2d3 website.