First of all, let’s start with a bit of vocabulary. CSV stands for Comma-Separated Values or, more broadly, Character-Separated Values. This means that the data are delimited by commas, or by other characters such as semicolons or tabs, while quotes are typically used to enclose a value that itself contains the delimiter. A CSV is a plain text file. In other words, it holds information arranged in columns and rows that you can display in a spreadsheet like Excel.
In this article, we explain how a CSV file is structured and how to parse it with Drop to Kibana.
The first row, or the headers of the CSV
The first row of the CSV file contains the data labels, also called the headers. Each following row corresponds to one data record: in our case, a COVID-19 testing site.
To clarify, commas (or other delimiters) separate the fields, each of which holds one precise piece of information. A field can hold various types of data, such as a number, a location or a date. To learn more about data types, I recommend this video, from 14min42.
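To make the header/record relationship concrete, here is a minimal sketch using Python’s standard csv module. The sample rows and the read_records helper are hypothetical; only the column names are borrowed from the file discussed in this article.

```python
import csv
import io

# Hypothetical sample: a header row plus two records, reusing a few
# column names from the file discussed in this article.
SAMPLE = (
    "rs,check_rdv,tel_rdv,date_modif\n"
    "LABO EXEMPLE A,Sur rendez-vous uniquement,0100000000,2020-09-24\n"
    "LABO EXEMPLE B,Sans rendez-vous,0200000000,2020-09-25\n"
)

def read_records(text, delimiter=","):
    """Return (headers, records); each record maps a header to its value."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    records = list(reader)  # the first row is consumed as the header
    return reader.fieldnames, records

headers, records = read_records(SAMPLE)
print(headers)            # the data labels from the first row
print(len(records))       # number of data rows, header excluded
print(records[0]["rs"])   # one field of the first record
```

Because each record is keyed by the header labels, a value is looked up by name rather than by its position in the row.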
Let’s take a closer look at our file:
The first row, called the header, shows the data labels. The website where you can download the file indicates that the labels correspond to:
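id_ej: legal-entity FINESS number (Finess juridique)
finess: geographic FINESS number (Finess géographique)
rs: registered name (Raison sociale)
cpl_loc: additional location details (Complément localisation)
do_prel: performs RT-PCR tests (Effectue test RT-PCR)
do_antigenic: performs antigen tests (Effectue test antigénique)
mod_prel: sampling arrangements (Modalités de prélèvement)
public: audiences accepted (Publics accueillis)
horaire_prio: opening hours for priority persons (Horaire personnes prioritaires)
check_rdv: with or without appointment (Avec ou sans rendez-vous)
tel_rdv: phone number for booking an appointment (Téléphone prise rendez-vous)
web_rdv: website for booking an appointment (Site internet prise rendez-vous)
date_modif: last update date (Dernière date de mise à jour)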
We can find all these fields in Kibana, as you’ll discover in the following steps.
Next steps on the raw CSV file
In the rest of the file, each row corresponds to a test site. In a nutshell, the 3,272 rows of the file describe the 3,272 test sites. Let’s focus on the first one after the header:
In each row, the fields are in the same order as in the header row, from the identifier through to the last update date. To go further, let’s lay this row out as if we had opened the file in a spreadsheet. It would look like this:
|HlI2rCJ014Dk4X3Z||010001725||010001733||BM CROIX BLANCHE BOURG EN B||1 AV AMEDEE MERCIER 01000 BOURG EN BRESSE||5.24185205182066||46.2038511077026||Sur place||Tout public||lundi : 8h00-12h00 et 14h00-19h00 | mardi : 8h00-12h|
samedi : 8h00-12h00 | dimanche : fermé
|/||Sur rendez-vous uniquement||0474452636||/||2020-09-24|
As you can see, some cells are empty. This means that in our file, nothing is filled in for that field. How can we tell? In the text file, two commas follow each other, which indicates that the field is empty: there is no value between these two commas.
Kibana, however, indicates this empty field in another way. We will see that in the next part.
What does your CSV file look like in Kibana?
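The “two commas in a row” rule can be checked with a short sketch based on Python’s standard csv module. The line below is a made-up example, not a row from the actual file.

```python
import csv
import io

# Hypothetical line: the third field is left empty, so two commas touch.
line = "LABO EXEMPLE A,Sur place,,2020-09-24\n"

# csv.reader splits the line on commas and yields one list per row.
fields = next(csv.reader(io.StringIO(line)))

# An empty field comes back as an empty string.
empty_positions = [i for i, value in enumerate(fields) if value == ""]

print(fields)
print(empty_positions)
```

The row still has four fields: the empty one keeps its position, so the values that follow it stay aligned with their headers.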
When we load our raw file into Kibana (tutorials available here to get started), we specify the “delimiter” used in the CSV. In our case, the delimiter, or separator, is the comma. Once it is specified, Drop to Kibana can parse the fields and extract the information correctly. The fields are then categorized by type: text or character string (string), dates (date), numbers (number), or geolocated coordinates (geo_point).
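As an illustration of these two ideas (splitting on a chosen delimiter, then categorizing each value), here is a rough Python sketch. It is not how Drop to Kibana works internally: guess_type and parse_line are hypothetical helpers, the date check only recognizes the YYYY-MM-DD format, and geo_point detection is left out.

```python
import csv
import io
from datetime import datetime

def guess_type(value):
    """Rough guess: 'number', 'date' (YYYY-MM-DD only) or 'string'."""
    try:
        float(value)
        return "number"
    except ValueError:
        pass
    try:
        datetime.strptime(value, "%Y-%m-%d")
        return "date"
    except ValueError:
        return "string"

def parse_line(line, delimiter=","):
    """Split one line of text into fields using the given delimiter."""
    return next(csv.reader(io.StringIO(line), delimiter=delimiter))

# The same three values, once comma-separated and once semicolon-separated.
comma_fields = parse_line("BM CROIX BLANCHE BOURG EN B,46.2038511077026,2020-09-24")
semicolon_fields = parse_line(
    "BM CROIX BLANCHE BOURG EN B;46.2038511077026;2020-09-24", delimiter=";"
)

types = [guess_type(value) for value in comma_fields]
print(comma_fields == semicolon_fields)
print(types)
```

A naive guess like this one would also label a phone number such as 0474452636 as a number, which is one reason to check the field types after an import.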
When our file is in Kibana, we also find the header we talked about a few paragraphs above. You can see it in Discover, and it looks like this.
In Kibana, the number of records equals the number of data rows in the raw file (the header row is not counted). For example, in our file on COVID-19 sampling sites, we count:
Going a bit further with our CSV file, we can display every row with its field details in Kibana’s Discover menu.
Each “_source” corresponds to a row describing a test site. The difference from the raw file is that Kibana places the field name before each value:
Kibana may display the value “NULL” for some fields, and this is completely normal! Earlier in this article, I told you about empty fields in the raw file, marked by two consecutive commas. When Kibana processes our file, it puts this famous “NULL” value in the empty field to indicate that nothing was filled in for that cell.
Now you are an expert in CSV files, ready to use them in Kibana and get insights from your data.
If you want to learn more about Kibana, I invite you to have a look at our previous articles. For the freshest posts and tips, you can follow us on your favorite social network: LinkedIn, Facebook or Twitter.
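The idea of a document that pairs each field name with its value, with a null marker for empty cells, can be sketched in Python as follows. This is only an illustration, not Kibana’s actual processing: to_document is a hypothetical helper, and the sample values are partly borrowed from the row shown earlier, with the choice of which field is empty being an assumption.

```python
import json

def to_document(headers, values):
    """Pair each header with its value; empty strings become None."""
    return {
        name: (value if value != "" else None)
        for name, value in zip(headers, values)
    }

# Sample row: the second field (horaire_prio) is assumed to be empty.
headers = ["rs", "horaire_prio", "check_rdv", "date_modif"]
values = ["BM CROIX BLANCHE BOURG EN B", "", "Sur rendez-vous uniquement", "2020-09-24"]

doc = to_document(headers, values)

# Python's None is written as null when the document is serialized to JSON.
print(json.dumps(doc, ensure_ascii=False, indent=2))
```

Keeping the empty field in the document with an explicit null, rather than dropping it, makes it visible that the cell exists but was never filled in.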