The siMPle FTIR data analysis software produces files that contain many
microplastic records. Reporting results and running optimization
experiments may require processing a substantial number of data files.
To make the analysis of these result files correct, efficient,
transparent, and reproducible, a data analysis script is needed for
Dutch microplastic monitoring. The R-package siMPleR, which serves this
need, is described in this manual.
The functional requirements and technical specifications are given in
van Loon & Walvoort (2024) and are shipped with the
siMPleR-package.
One of the design goals was to create a simple R-script that should be able to run in a standard R-console of the R Project for Statistical Computing.
If R is not installed on your computer, you can download it for free from the R Project website (https://www.r-project.org). R is available for MS-Windows, Linux, and macOS. Follow the instructions on this website to install R on your computer. You should install R version >= 4.4.0.
The siMPleR software requires a limited number of packages to run. On
MS-Windows, these can be installed via the menu option:
Packages | Install package(s)... | Select CRAN mirror | Select packages.
You should select the following packages:
dplyr, readr, purrr, stringr, yaml, tidyr, ggplot2
It may be more convenient to install the packages by running the following code in the R-console:
install.packages(c("dplyr", "readr", "purrr", "stringr", "yaml", "tidyr", "ggplot2"))
Copy and paste this line into the R-console and press the return-key. This also works on Linux and macOS.
Installing packages is only needed once.
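To verify the setup described above, a short check such as the following can be run in the R-console. It is a minimal sketch that uses only base R functions; the variable names are illustrative and not part of siMPleR. It reports the R version and the required packages that are still missing, and it deliberately leaves the actual installation to you.

```r
# Packages listed in this manual as required by siMPleR.
required <- c("dplyr", "readr", "purrr", "stringr", "yaml", "tidyr", "ggplot2")

# The manual asks for R version >= 4.4.0.
if (getRversion() < "4.4.0") {
  message("R ", getRversion(), " is older than the recommended version 4.4.0.")
}

# TRUE for every package that can be loaded from the current library paths.
is_installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required[!is_installed]

if (length(missing_pkgs) == 0) {
  message("All required packages are installed.")
} else {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
  # Uncomment the next line to install only the missing packages:
  # install.packages(missing_pkgs)
}
```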
The siMPleR-package is provided in a zip file. You
should unzip this file in a directory of choice. This will be your
working directory.
In R you can select this working directory by typing,
e.g.:
setwd("c:/simpler")
in the R-console, where c:/simpler is a
placeholder name for your working directory. As an alternative, you can
also select the menu option File | Change dir... in the R
Console (MS-Windows only).
This working directory now contains the
polymer-colours.yaml-file and the following
subdirectories:
siMPleR-package.
The siMPleR-package can be installed by typing:
install.packages("./pkg/siMPleR_1.0.0.tar.gz", repos = NULL, type = "source")
in the R console. As an alternative, the package can also be
installed via the RGui menu (Packages | Install package(s) from local
files…). Installing the siMPleR-package needs to be done
only once.
The siMPle FTIR software processes FTIR instrument files and produces a list of MP identifications. The format of this list is:
These files are included in the input directory. On some computers,
the character set that siMPle FTIR uses gives
weird looking characters in the units of the header in the data files.
The R-package siMPleR can handle these characters.
In the input directory, all the individual siMPle FTIR files are placed.
Each siMPle FTIR file must have the following filename format:
@location_Y1234_Rx_Ey_G12
The metadata are encoded in the filename: @**** gives the location code, Y**** the year, R*
the replicate number, E* the extraction number, and G* the sample mass
analyzed in grams. For example: @NW2_Y2023_R1_E2_G20. The
order of the metadata items is irrelevant. Note that the items are
separated by underscores. Additional text is allowed in the filename.
However, this text should not start with any of the reserved prefixes
given above (@, Y, R, E, or G).
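The filename convention above can be illustrated with a small parser. This is a sketch in base R, not the code used inside siMPleR: the function name parse_simple_filename, the assumed .csv extension, and the assumption that replicate and extraction numbers are integers are all hypothetical.

```r
# Hypothetical helper: extract the metadata items from a siMPle FTIR filename.
parse_simple_filename <- function(filename) {
  # Drop the directory and an assumed ".csv" extension, then split on underscores.
  stem  <- sub("\\.csv$", "", basename(filename), ignore.case = TRUE)
  parts <- strsplit(stem, "_", fixed = TRUE)[[1]]

  # Return the captured value of the first part matching `pattern`, or NA.
  pick <- function(pattern) {
    hit <- grep(pattern, parts, value = TRUE)
    if (length(hit) == 0) NA_character_ else sub(pattern, "\\1", hit[1])
  }

  list(
    location   = pick("^@(.+)$"),
    year       = as.integer(pick("^Y([0-9]{4})$")),
    replicate  = as.integer(pick("^R([0-9]+)$")),
    extraction = as.integer(pick("^E([0-9]+)$")),
    mass_g     = as.numeric(pick("^G([0-9.]+)$"))
  )
}

# The example filename from this manual, and a variant with a different
# item order and additional text ("sampleA").
meta     <- parse_simple_filename("@NW2_Y2023_R1_E2_G20.csv")
shuffled <- parse_simple_filename("sampleA_G20_E2_@NW2_R1_Y2023.csv")
str(meta)
```

Because every item is recognised by its prefix rather than by its position, both calls return the same metadata, which matches the statement that the order of the items is irrelevant.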
In each siMPle file, a QC column (with header QC) must be added manually. This QC column may contain the following QC-codes (added by the analyst or by the script):
ppring: if a PP is a part of the Anodisc support ring. This code is added by the analyst;
<50 um: or any other lower length limit used. This code is added by the script;
Duplicate: determined by the script as the same polymer at the same coordinates; the record with the largest length is kept;
natural: if the second database shows that the identification is a natural particle. This code is added by the analyst;
=: if the second FTIR database gives the same identification as siMPle. This code is added by the analyst;
plastic: if the second database gives a different MP identification than siMPle, so that the correct polymer identification is uncertain. This code is added by the analyst.
The QC field is often empty, namely when no QC has been performed on that record. Additional QC codes are allowed in the siMPle files, but are not processed by the script.
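The two codes that are added by the script can be illustrated as follows. This is a minimal sketch in base R under stated assumptions, not the siMPleR implementation: the column names polymer, x, y, and length, the example records, and the variable min_length are hypothetical, and ties in length are not resolved.

```r
# Hypothetical records; only the QC column is prescribed by this manual.
records <- data.frame(
  polymer = c("PE", "PE", "PP", "PE"),
  x       = c(10, 10, 10, 30),
  y       = c(20, 20, 20, 40),
  length  = c(120, 80, 60, 40),
  QC      = c("", "", "ppring", ""),
  stringsAsFactors = FALSE
)

min_length <- 50  # assumed lower length limit in um
size_code  <- paste0("<", min_length, " um")

# Flag particles below the lower length limit (only where QC is still empty).
too_small <- records$QC == "" & records$length < min_length
records$QC[too_small] <- size_code

# Same polymer at the same coordinates: keep the longest record, flag the rest.
key        <- paste(records$polymer, records$x, records$y, sep = "|")
is_longest <- records$length == ave(records$length, key, FUN = max)
records$QC[records$QC == "" & !is_longest] <- "Duplicate"

# Records that remain in the valid dataset.
valid <- records[!records$QC %in% c("ppring", "Duplicate", size_code), ]
print(records)
```

In this example the second record is flagged as Duplicate, the fourth as <50 um, and only the first record remains in the valid dataset.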
Additional metadata can be added to the siMPle FTIR output for reporting purposes.
An example of the desired input format is given in the input directory. This directory is provided with the siMPleR software.
Note that siMPleR is not restricted to analyzing output files from the siMPle FTIR software. Output files of any software can be used, as long as their format complies with the specifications given above.
First you have to select your working directory (see Section
“Installing siMPleR”). In the R Console this can be done by means of the
menu option: File | Change dir.... A simple browser will
appear that you can use to navigate to your working directory.
The siMPleR-package can be loaded by typing
library(siMPleR) in the console.
The siMPleR-software can be started by running
simpler() in the console. Two questions will be asked:
The data file will first be checked for possible errors. The following checks will be performed:
Only records that have successfully passed quality control will be used in the data analysis.
The script performs several data analysis steps:
The colours of the graphs may be specified by changing the colour
codes in polymer-colours.yaml. How to change colour codes
is explained in the header section of this file.
siMPleR produces the following outputs, each with a timestamp to uniquely identify individual runs:
| Group | total | # QCs | # = | # plastic | # natural | % false pos |
|---|---|---|---|---|---|---|
| APU | 7 | 6 | 0 | 6 | 0 | 0 |
| EVA | 3 | 3 | 0 | 3 | 1 | 25 |
| PA | 2 | 2 | 2 | 0 | 0 | 0 |
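The last column of this table can be reproduced with a few lines of R. The sketch below assumes that the percentage of false positives is the number of records coded natural divided by the number of records with an external QC outcome (=, plastic, or natural). This definition is an assumption that is consistent with the three example rows; the authoritative definition is in the technical specifications. The column names are illustrative.

```r
# Counts per polymer group, copied from the example table above.
qc <- data.frame(
  group   = c("APU", "EVA", "PA"),
  same    = c(0, 0, 2),  # QC code "="
  plastic = c(6, 3, 0),  # QC code "plastic"
  natural = c(0, 1, 0)   # QC code "natural"
)

# Assumed definition: share of externally checked records found to be natural.
checked <- qc$same + qc$plastic + qc$natural
qc$pct_false_pos <- ifelse(checked > 0, 100 * qc$natural / checked, NA_real_)
print(qc)
```

Groups without any externally checked record get NA instead of a percentage, which avoids a division by zero.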
The following classification rules are applied to construct this table. The siMPle records with the QC-codes ‘ppring’, ‘< 50 um’ and ‘duplicate’ are excluded from the calculation because they fall outside the valid dataset. These records can still be found in the basic-data file for information and QC purposes. A QC action using an external database can have three outcomes: ‘=’ (the same identification as siMPle), ‘plastic’ (a different MP identification than siMPle), or ‘natural’ (a natural particle).
Van Loon, W. and D. Walvoort. Technical Specifications of the siMPleR script: post-analysis of microplastic data. Version 13, 14-01-2025