Title: | Aids in Processing and Plotting Data from a Lemna-Tec Scananalyzer |
---|---|
Description: | Note that 'imageData' has been superseded by 'growthPheno'. The package 'growthPheno' incorporates all the functionality of 'imageData' and has functionality not available in 'imageData', but some 'imageData' functions have been renamed. The 'imageData' package is no longer maintained, but is retained for legacy purposes. |
Authors: | Chris Brien [aut, cre] |
Maintainer: | Chris Brien <[email protected]> |
License: | GPL (>= 2) |
Version: | 0.1-62 |
Built: | 2025-01-23 05:57:34 UTC |
Source: | https://github.com/cran/imageData |
Date: 2023-08-21
For an overview of the use of these functions and an example see below.
(i) Data | |
RiceRaw.dat | Data for an experiment to investigate a rice germplasm panel. |
(ii) Data frame manipulation | |
designFactors | Adds the factors and covariates for a blocked, split-plot design. |
getDates | Forms a subset of 'responses' in 'data' that contains their values for the nominated times. |
importExcel | Imports an Excel imaging file and allows some renaming of variables. |
longitudinalPrime | Selects a set of variables to be retained in a data frame of longitudinal data. |
twoLevelOpcreate | Creates a data.frame formed by applying, for each response, a binary operation to the values of two different treatments. |
(iii) Plots | |
anomPlot | Identifies anomalous individuals and produces longitudinal plots without them and with just them. |
corrPlot | Calculates and plots correlation matrices for a set of responses. |
imagetimesPlot | Plots the time within an interval versus the interval. For example, the hour of the day carts are imaged against the days after planting (or some other number of days after an event). |
longiPlot | Plots longitudinal data from a Lemna Tec Scananalyzer. |
probeDF | Compares, for a set of specified values of df, a response and the smooths of it, possibly along with growth rates calculated from the smooths. |
(iv) Calculations value-by-value | |
GrowthRates | Calculates growth rates (AGR, PGR, RGRdiff) between pairs of values in a vector. |
WUI | Calculates the Water Use Index (WUI). |
anom | Tests if any values in a vector are anomalous in being outside specified limits. |
calcTimes | Calculates, for a set of times, the time intervals after an origin time and the position of each within that time. |
calcLagged | Replaces the values in a vector with the result of applying an operation to it and a lagged value. |
cumulate | Calculates the cumulative sum, ignoring the first element if exclude.1st is TRUE. |
(v) Calculations over multiple values | |
fitSpline | Produces the fits from a natural cubic smoothing spline applied to a response in a 'data.frame'. |
intervalGRaverage | Calculates the growth rates for a specified time interval by taking weighted averages of growth rates for times within the interval. |
intervalGRdiff | Calculates the growth rates for a specified time interval. |
intervalValueCalculate | Calculates a single value that is a function of an individual's values for a response over a specified time interval. |
intervalWUI | Calculates water use indices (WUI) over a specified time interval to a data.frame. |
(vi) Calculations in each split of a 'data.frame' | |
splitContGRdiff | Adds the growth rates calculated continuously over time for subsets of a response to a 'data.frame'. |
splitSplines | Adds the fits after fitting a natural cubic smoothing spline to subsets of a response to a 'data.frame'. |
splitValueCalculate | Calculates a single value that is a function of an individual's values for a response. |
(vii) Principal variates analysis (PVA) | |
intervalPVA | Selects a subset of variables observed within a specified time interval using PVA. |
PVA | Selects a subset of variables using PVA. |
rcontrib | Computes a measure of how correlated each variable in a set is with the other variables, conditional on a nominated subset of them. |
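The growth rates listed under (iv) can be illustrated with a few lines of base R. This is a minimal sketch using the conventional definitions of the absolute (AGR), proportionate (PGR) and relative (RGR) growth rates between successive observations; it assumes these match what GrowthRates computes and does not call the package. The vectors area and days are made-up illustration data.

```r
# Made-up observations of a response at unequally spaced times
area <- c(100, 120, 150, 195)
days <- c(30, 31, 33, 34)

dt <- diff(days)                                   # time between successive observations

agr <- diff(area) / dt                             # AGR: change per unit time
pgr <- (area[-1] / area[-length(area)])^(1 / dt)   # PGR: ratio per unit time
rgr <- log(pgr)                                    # RGR: log of the PGR

print(data.frame(day = days[-1], AGR = agr, PGR = pgr, RGR = rgr))
```

Under these definitions the RGR equals the difference of the logged response divided by the time difference, which is why it can be obtained by differencing.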
This package can be used to carry out a full seven-step process to produce phenotypic traits from measurements made in a high-throughput phenotyping facility, such as one based on a Lemna-Tec Scananalyzer 3D system and described by Al-Tamimi et al. (2016). Otherwise, individual functions can be used to carry out parts of the process.
The basic data consists of imaging data obtained from a set of pots or carts over time. The carts are arranged in a grid of Lanes × Positions. There should be a unique identifier for each cart, which by default is Snapshot.ID.Tag, and a variable, by default Time.after.Planting..d., giving the Days after Planting for each measurement. In some cases, it is expected that there will be a column labelled Snapshot.Time.Stamp, which reflects the time of the imaging from which a particular data value was obtained.
The full seven-step process is as follows:
1. Use importExcel to import the raw data from the Excel file. This step should also involve any editing of the data needed to take account of mishaps during the data collection and the need to remove faulty data (produces raw.dat). Generally, data can be removed by replacing only the values of the responses with missing values (NA) for carts whose data is to be removed, leaving the identifying information intact.
2. Use longitudinalPrime to select a subset of the imaging variables produced by the Lemna Tec Scanalyzer and, if the design is a blocked, split-plot design, use designFactors to add covariates and factors that might be used in the analysis (produces the data frame longi.prime.dat).
3. Add derived traits that result in a value for each observation: use splitContGRdiff to obtain continuous growth rates, i.e. a growth rate for each time of observation except the first; WUI to produce continuous Water Use Efficiency Indices (WUE); and cumulate to produce cumulative responses (produces the data frame longi.dat).
4. Use splitSplines to fit splines to smooth the longitudinal trends in the primary traits and calculate continuous growth rates from the smoothed response (added to the data frame longi.dat). There are two options for calculating continuous smoothed growth rates: (i) by differencing, using splitContGRdiff; (ii) from the first derivatives of the splines, by including 1 in the deriv argument of splitSplines, including "AGR" in suffices.deriv and setting the RGR argument to, say, "RGR". Optionally, use probeDF to compare the smooths for a number of values of df and, if necessary, re-run splitSplines with a revised value of df.
5. Perform an exploratory examination of the unsmoothed data by using longiPlot to produce longitudinal plots of unsmoothed imaging traits and continuous growth rates. Also, use longiPlot to plot the smoothed imaging traits and continuous growth rates and anomPlot to check for anomalies in the data.
6. Produce cart data: traits for which there is a single value for each Snapshot.ID.Tag or cart (produces the data frame cart.dat).
   (a) Set up a cart data.frame with the factors and covariates for a single observation from all carts. This can be done by subsetting longi.dat so that there is one entry for each cart.
   (b) Use getDates to add traits at specific times to the cart data.frame, often the first and last day of imaging for each Snapshot.ID.Tag. The times need to be selected so that there is one and only one observation for each cart. Also form traits, such as growth rates over the whole imaging period, based on these values.
   (c) Based on the longitudinal plots, decide on the intervals for which growth rates and WUEs are to be calculated. The growth rates for intervals are calculated from the continuous growth rates, using intervalGRdiff, if the continuous growth rates were calculated by differencing, or intervalGRaverage, if they were calculated from first derivatives. To calculate WUEs for intervals, use intervalWUI. The interval growth rates and WUEs are added to the cart data.frame.
7. (Optional) For experiments investigating salinity, the Shoot Ion Independent Tolerance (SIIT) index can be calculated using twoLevelOpcreate.
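The two ways of obtaining smoothed growth rates in the spline-fitting step can be sketched with stats::smooth.spline from base R. This is only an illustration under stated assumptions: it does not call splitSplines, the data in days and area are made up, and the RGR is taken to be the first derivative divided by the smoothed value (the derivative of the log response), which is assumed rather than confirmed to be what the package computes.

```r
# Made-up daily observations for a single cart: exponential growth plus a fixed wiggle
days <- 30:42
area <- 50 * exp(0.15 * (days - 30)) +
  c(2, -3, 1, 4, -2, 0, 3, -4, 2, 1, -1, 3, -2)

# Cubic smoothing spline with 4 effective degrees of freedom
fit <- smooth.spline(days, area, df = 4)
area_smooth <- predict(fit, days)$y

# Option (i): growth rates by differencing the smoothed response
agr_diff <- diff(area_smooth) / diff(days)
rgr_diff <- diff(log(area_smooth)) / diff(days)

# Option (ii): growth rates from the first derivative of the spline
agr_deriv <- predict(fit, days, deriv = 1)$y
rgr_deriv <- agr_deriv / area_smooth

print(head(data.frame(days, area_smooth, agr_deriv, rgr_deriv)))
```

Differencing yields no value for the first time, whereas the derivative is available at every time; this is one practical difference between the two options.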
Chris Brien [aut, cre] (<https://orcid.org/0000-0003-0581-1817>)
Al-Tamimi, N., Brien, C. J., Oakey, H., Berger, B., Saade, S., Ho, Y. S., Schmockel, S. M., Tester, M. and Negrao, S. (2016) New salinity tolerance loci revealed in rice using high-throughput non-invasive phenotyping. Nature Communications, 7, 13342.
## Not run: 
### This example can be run because the data.frame RiceRaw.dat is available with the package
#'# Step 1: Import the raw data
data(RiceRaw.dat)

#'# Step 2: Select imaging variables and add covariates and factors (produces longi.dat)
longi.dat <- longitudinalPrime(data=RiceRaw.dat, smarthouse.lev=c("NE","NW"))
longi.dat <- designFactors(longi.dat, insertName = "xDays",
                           designfactorMethod="StandardOrder")

#'## Particular edits to longi.dat
longi.dat <- within(longi.dat, {
  Days.after.Salting <- as.numfac(Days) - 29
})
longi.dat <- with(longi.dat, longi.dat[order(Snapshot.ID.Tag,Days), ])

#'# Step 3: Form derived traits that result in a value for each observation
#'### Set responses
responses.image <- c("Area")
responses.smooth <- paste(responses.image, "smooth", sep=".")

#'## Form growth rates for each observation of a subset of responses by differencing
longi.dat <- splitContGRdiff(longi.dat, responses.image,
                             INDICES="Snapshot.ID.Tag",
                             which.rates = c("AGR","RGR"))

#'## Form Area.WUE
longi.dat <- within(longi.dat, {
  Area.WUE <- WUI(Area.AGR*Days.diffs, Water.Loss)
})

#'## Add cumulative responses
longi.dat <- within(longi.dat, {
  Water.Loss.Cum <- unlist(by(Water.Loss, Snapshot.ID.Tag,
                              cumulate, exclude.1st=TRUE))
  WUE.cum <- Area / Water.Loss.Cum
})

#'# Step 4: Fit splines to smooth the longitudinal trends in the primary traits and
#'# calculate their growth rates
#'
#'## Smooth responses
#+
for (response in c(responses.image, "Water.Loss"))
  longi.dat <- splitSplines(longi.dat, response, x="xDays",
                            INDICES = "Snapshot.ID.Tag",
                            df = 4, na.rm=TRUE)
longi.dat <- with(longi.dat, longi.dat[order(Snapshot.ID.Tag, xDays), ])

#'## Loop over smoothed responses, forming growth rates by differences
#+
responses.GR <- paste(responses.smooth, "AGR", sep=".")
longi.dat <- splitContGRdiff(longi.dat, responses.smooth,
                             INDICES="Snapshot.ID.Tag",
                             which.rates = c("AGR","RGR"))

#'## Finalize longi.dat
longi.dat <- with(longi.dat, longi.dat[order(Snapshot.ID.Tag, xDays), ])

#'# Step 5: Do exploratory plots on unsmoothed and smoothed longitudinal data
responses.longi <- c("Area","Area.AGR","Area.RGR", "Area.WUE")
responses.smooth.plot <- c("Area.smooth","Area.smooth.AGR","Area.smooth.RGR")
titles <- c("Total area (1000 pixels)",
            "Total area AGR (1000 pixels per day)",
            "Total area RGR (per day)",
            "Total area WUE (1000 pixels per mL)")
titles.smooth <- titles
nresp <- length(responses.longi)
limits <- list(c(0,1000), c(-50,125), c(-0.05,0.40), c(0,30))

#' ### Plot unsmoothed profiles for all longitudinal responses
#+ "01-ProfilesAll"
for (k in 1:nresp) {
  longiPlot(data = longi.dat, response = responses.longi[k],
            y.title = titles[k], x="xDays+35.42857143",
            ggplotFuncs = list(geom_vline(xintercept=29, linetype="longdash", size=1),
                               scale_x_continuous(breaks=seq(28, 42, by=2)),
                               scale_y_continuous(limits=limits[[k]])))
}

#' ### Plot smoothed profiles for all longitudinal responses - GRs by difference
#+ "01-SmoothedProfilesAll"
nresp.smooth <- length(responses.smooth.plot)
limits <- list(c(0,1000), c(0,100), c(0.0,0.40))
for (k in 1:nresp.smooth) {
  longiPlot(data = longi.dat, response = responses.smooth.plot[k],
            y.title = titles.smooth[k], x="xDays+35.42857143",
            ggplotFuncs = list(geom_vline(xintercept=29, linetype="longdash", size=1),
                               scale_x_continuous(breaks=seq(28, 42, by=2)),
                               scale_y_continuous(limits=limits[[k]])))
}

#'### AGR anomalies - plot without anomalous plants followed by plot of anomalous plants
#+ "01-0254-AGRanomalies"
anom.ID <- vector(mode = "character", length = 0L)
response <- "Area.smooth.AGR"
cols.output <- c("Snapshot.ID.Tag", "Smarthouse", "Lane", "Position",
                 "Treatment.1", "Genotype.ID", "Days")
anomalous <- anomPlot(longi.dat, response=response, lower=2.5, start.time=40,
                      x = "xDays+35.42857143", vertical.line=29,
                      breaks=seq(28, 42, by=2),
                      whichPrint=c("innerPlot"), y.title=response)
subs <- subset(anomalous$data, Area.smooth.AGR.anom & Days==42)
if (nrow(subs) == 0) {
  cat("\n#### No anomalous data here\n\n")
} else {
  subs <- subs[order(subs["Smarthouse"],subs["Treatment.1"], subs[response]),]
  print(subs[c(cols.output, response)])
  anom.ID <- unique(c(anom.ID, subs$Snapshot.ID.Tag))
  outerPlot <- anomalous$outerPlot +
    geom_text(data=subs,
              aes_string(x = "xDays+35.42857143", y = response,
                         label="Snapshot.ID.Tag"),
              size=3, hjust=0.7, vjust=0.5)
  print(outerPlot)
}

#'# Step 6: Form single-value plant responses in Snapshot.ID.Tag order.
#'
#'## 6a) Set up a data frame with factors only
#+
cart.dat <- longi.dat[longi.dat$Days == 31,
                      c("Smarthouse","Lane","Position","Snapshot.ID.Tag",
                        "xPosn","xMainPosn",
                        "Zones","xZones","SHZones","ZLane","ZMainplots", "Subplots",
                        "Genotype.ID","Treatment.1")]
cart.dat <- cart.dat[do.call(order, cart.dat), ]

#'## 6b) Get responses based on first and last date.
#'
#'### Observation for first and last date
cart.dat <- cbind(cart.dat, getDates(responses.image, data = longi.dat,
                                     which.times = c(31), suffix = "first"))
cart.dat <- cbind(cart.dat, getDates(responses.image, data = longi.dat,
                                     which.times = c(42), suffix = "last"))
cart.dat <- cbind(cart.dat, getDates(c("WUE.cum"), data = longi.dat,
                                     which.times = c(42), suffix = "last"))
responses.smooth <- paste(responses.image, "smooth", sep=".")
cart.dat <- cbind(cart.dat, getDates(responses.smooth, data = longi.dat,
                                     which.times = c(31), suffix = "first"))
cart.dat <- cbind(cart.dat, getDates(responses.smooth, data = longi.dat,
                                     which.times = c(42), suffix = "last"))

#'### Growth rates over whole period.
#+
tottime <- 42 - 31
cart.dat <- within(cart.dat, {
  Area.AGR <- (Area.last - Area.first)/tottime
  Area.RGR <- log(Area.last / Area.first)/tottime
})

#'### Calculate water index over whole period
cart.dat <- merge(cart.dat,
                  intervalWUI("Area", water.use = "Water.Loss",
                              start.times = c(31), end.times = c(42),
                              suffix = NULL, data = longi.dat,
                              include.total.water = TRUE),
                  by = c("Snapshot.ID.Tag"))
names(cart.dat)[match(c("Area.WUI","Water.Loss.Total"),names(cart.dat))] <-
  c("Area.Overall.WUE", "Water.Loss.Overall")
cart.dat$Water.Loss.rate.Overall <- cart.dat$Water.Loss.Overall / (42 - 31)

#'## 6c) Add growth rates and water indices for intervals
#'### Set up intervals
#+
start.days <- list(31,35,31,38)
end.days <- list(35,38,38,42)
suffices <- list("31to35","35to38","31to38","38to42")

#'### Rates for specific intervals from the smoothed data by differencing
#+
for (r in responses.smooth) {
  for (k in 1:length(suffices)) {
    cart.dat <- merge(cart.dat,
                      intervalGRdiff(r, which.rates = c("AGR","RGR"),
                                     start.times = start.days[k][[1]],
                                     end.times = end.days[k][[1]],
                                     suffix.interval = suffices[k][[1]],
                                     data = longi.dat),
                      by = "Snapshot.ID.Tag")
  }
}

#'### Water indices for specific intervals from the unsmoothed and smoothed data
#+
for (k in 1:length(suffices)) {
  cart.dat <- merge(cart.dat,
                    intervalWUI("Area", water.use = "Water.Loss",
                                start.times = start.days[k][[1]],
                                end.times = end.days[k][[1]],
                                suffix = suffices[k][[1]],
                                data = longi.dat, include.total.water = TRUE),
                    by = "Snapshot.ID.Tag")
  names(cart.dat)[match(paste("Area.WUI", suffices[k][[1]], sep="."),
                        names(cart.dat))] <- paste("Area.WUE", suffices[k][[1]], sep=".")
  cart.dat[paste("Water.Loss.rate", suffices[k][[1]], sep=".")] <-
    cart.dat[[paste("Water.Loss.Total", suffices[k][[1]], sep=".")]] /
    (end.days[k][[1]] - start.days[k][[1]])
}
cart.dat <- with(cart.dat, cart.dat[order(Snapshot.ID.Tag), ])

#'# Step 7: Form continuous and interval SIITs
#'
#'## 7a) Calculate continuous
#+
cols.retained <- c("Snapshot.ID.Tag","Smarthouse","Lane","Position",
                   "Days","Snapshot.Time.Stamp", "Hour", "xDays",
                   "Zones","xZones","SHZones","ZLane","ZMainplots",
                   "xMainPosn", "Genotype.ID")
responses.GR <- c("Area.smooth.AGR","Area.smooth.AGR","Area.smooth.RGR")
suffices.results <- c("diff", "SIIT", "SIIT")
responses.SIIT <- unlist(Map(paste, responses.GR, suffices.results,sep="."))

longi.SIIT.dat <- twoLevelOpcreate(responses.GR, longi.dat,
                                   suffices.treatment=c("C","S"),
                                   operations = c("-", "/", "/"),
                                   suffices.results = suffices.results,
                                   columns.retained = cols.retained,
                                   by = c("Smarthouse","Zones","ZMainplots","Days"))
longi.SIIT.dat <- with(longi.SIIT.dat,
                       longi.SIIT.dat[order(Smarthouse,Zones,ZMainplots,Days),])

#' ### Plot SIIT profiles
#'
#+ "03-SIITProfiles"
nresp <- length(responses.SIIT)
limits <- with(longi.SIIT.dat,
               list(c(min(Area.smooth.AGR.diff, na.rm=TRUE),
                      max(Area.smooth.AGR.diff, na.rm=TRUE)),
                    c(0,3),
                    c(0,1.5)))
#Plots
for (k in 1:nresp) {
  longiPlot(data = longi.SIIT.dat, x="xDays+35.42857143",
            response = responses.SIIT[k],
            y.title=responses.SIIT[k],
            facet.x="Smarthouse", facet.y=".",
            ggplotFuncs = list(geom_vline(xintercept=29, linetype="longdash", size=1),
                               scale_x_continuous(breaks=seq(28, 42, by=2)),
                               scale_y_continuous(limits=limits[[k]])))
}

#'## 7b) Calculate interval SIITs and check for large values for SIIT for Days 31to35
#+ "01-SIITIntClean"
suffices <- list("31to35","35to38","31to38","38to42")
response <- "Area.smooth.RGR.31to35"
SIIT <- paste(response, "SIIT", sep=".")
responses.SIITinterval <- as.vector(outer("Area.smooth.RGR", suffices, paste, sep="."))

cart.SIIT.dat <- twoLevelOpcreate(responses.SIITinterval, cart.dat,
                                  suffices.treatment=c("C","S"),
                                  suffices.results="SIIT",
                                  columns.suffixed="Snapshot.ID.Tag")
tmp <- na.omit(cart.SIIT.dat)
print(summary(tmp[SIIT]))
big.SIIT <- with(tmp, tmp[tmp[SIIT] > 1.15,
                          c("Snapshot.ID.Tag.C","Genotype.ID",
                            paste(response,"C",sep="."),
                            paste(response,"S",sep="."), SIIT)])
big.SIIT <- big.SIIT[order(big.SIIT[SIIT]),]
print(big.SIIT)
plt <- ggplot(tmp, aes_string(SIIT)) +
  geom_histogram(aes(y = ..density..), binwidth=0.05) +
  geom_vline(xintercept=1.15, linetype="longdash", size=1) +
  theme_bw() + facet_grid(Smarthouse ~.)
print(plt)
plt <- ggplot(tmp, aes_string(x="Smarthouse", y=SIIT)) +
  geom_boxplot() + theme_bw()
print(plt)
remove(tmp)

## End(Not run)
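The whole-period growth rates formed in Step 6b of the example reduce to two formulae: AGR = (last - first) / tottime and RGR = log(last / first) / tottime. The following is a minimal base-R sketch of that step on made-up data; the data frame longi and its values are hypothetical, and merge on the cart identifier stands in for the package's getDates.

```r
# Hypothetical longitudinal data: two carts imaged on Days 31, 36 and 42
longi <- data.frame(
  Snapshot.ID.Tag = rep(c("cart1", "cart2"), each = 3),
  Days = rep(c(31, 36, 42), times = 2),
  Area = c(100, 180, 400, 90, 150, 300)
)

# One row per cart holding the first and last observation of Area
first <- longi[longi$Days == 31, c("Snapshot.ID.Tag", "Area")]
last  <- longi[longi$Days == 42, c("Snapshot.ID.Tag", "Area")]
names(first)[2] <- "Area.first"
names(last)[2]  <- "Area.last"
cart <- merge(first, last, by = "Snapshot.ID.Tag")

# Growth rates over the whole imaging period
tottime <- 42 - 31
cart$Area.AGR <- (cart$Area.last - cart$Area.first) / tottime
cart$Area.RGR <- log(cart$Area.last / cart$Area.first) / tottime

print(cart)
```

Selecting exactly one day for "first" and one for "last" keeps a single row per cart, which is the requirement stated for the cart data.frame.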
Test whether any values in x are less than the value of lower, if it is not NULL, or are greater than the value of upper, if it is not NULL, or both.
anom(x, lower=NULL, upper=NULL, na.rm = TRUE)
x |
A |
lower |
A |
upper |
A |
na.rm |
A |
A logical indicating whether any values have been found to be outside the limits specified by lower or upper or both.
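The test described above can be sketched in base R as follows. The function anom_sketch is a hypothetical stand-in written for illustration, not the package's code; in particular, passing na.rm through to any() is an assumption about how missing values are treated.

```r
# Hypothetical re-implementation of the limit test, for illustration only
anom_sketch <- function(x, lower = NULL, upper = NULL, na.rm = TRUE) {
  # A limit that is NULL is simply not tested
  below <- if (is.null(lower)) FALSE else any(x < lower, na.rm = na.rm)
  above <- if (is.null(upper)) FALSE else any(x > upper, na.rm = na.rm)
  below || above
}

res_low  <- anom_sketch(c(3.1, 2.7, 2.2), lower = 2.5)             # one value below 2.5
res_both <- anom_sketch(c(3.1, 2.7, NA), lower = 2.5, upper = 4)   # all within limits
print(c(res_low, res_both))
```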
Chris Brien
data(exampleData)
anom.val <- anom(longi.dat$Area.smooth.AGR, lower=2.5)
Uses intervalValueCalculate and the function anom to identify anomalous individuals. The user can elect to print the anomalous individuals, a longitudinal profile plot without the anomalous individuals and/or a longitudinal profile plot with only the anomalous individuals. The plots are produced using ggplot. The plot can be facetted so that a grid of plots is produced.
anomPlot(data, x="xDays+24.16666667", response="Area.smooth.RGR",
         individuals="Snapshot.ID.Tag",
         breaks=seq(12, 36, by=2), vertical.line=NULL,
         groupsFactor=NULL, lower=NULL, upper=NULL,
         start.time=NULL, end.time=NULL, times.factor = "Days",
         suffix.interval=NULL,
         columns.retained=c("Snapshot.ID.Tag", "Smarthouse", "Lane",
                            "Position", "Treatment.1", "Genotype.ID"),
         whichPrint=c("anomalous","innerPlot","outerPlot"), na.rm=TRUE, ...)
data |
A |
x |
A |
response |
A |
individuals |
A |
breaks |
A |
vertical.line |
A |
groupsFactor |
A |
lower |
A |
upper |
A |
start.time |
A |
end.time |
A |
times.factor |
A |
suffix.interval |
A |
columns.retained |
A |
whichPrint |
A |
na.rm |
A |
... |
allows for arguments to be passed to |
A list with three components:
data, a data frame resulting from the merge of data and the logical identifying whether or not an individual is anomalous;
innerPlot, an object of class ggplot storing the longitudinal plot of the individuals that are not anomalous;
outerPlot, an object of class ggplot storing the longitudinal plot of only the individuals that are anomalous.
The name of the column indicating anomalous individuals will be the result of concatenating the response, anom and, if it is not NULL, suffix.interval, each separated by a full stop. The ggplot objects can be plotted using print and can be modified by adding ggplot functions before printing. If there are no observations to plot, NULL will be returned for the plot.
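The naming rule for the anomaly column can be sketched in one line of base R. The helper anom_column_name is hypothetical and only mirrors the rule as stated in the text; the interval suffix "40to42" is a made-up value.

```r
# Hypothetical helper that builds the column name as described in the text
anom_column_name <- function(response, suffix.interval = NULL) {
  # c() drops a NULL suffix, so no trailing full stop is produced
  paste(c(response, "anom", suffix.interval), collapse = ".")
}

name_plain    <- anom_column_name("Area.smooth.AGR")
name_interval <- anom_column_name("Area.smooth.AGR", suffix.interval = "40to42")
print(c(name_plain, name_interval))
```

The first form, Area.smooth.AGR.anom, is the column used to subset the returned data in the package overview example.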
Chris Brien
anom, intervalValueCalculate, ggplot.
data(exampleData)
anomalous <- anomPlot(longi.dat, response="Area.smooth.AGR",
                      lower=2.5, start.time=40,
                      x = "xDays+35.42857143", vertical.line=29,
                      breaks=seq(28, 42, by=2),
                      whichPrint=c("innerPlot"),
                      y.title="Area.smooth.AGR")
Replaces the values in x with the result of applying an operation to it and the value that is lag positions either before it or after it in x, depending on whether lag is positive or negative. For positive lag the first lag values will be NA, while for negative lag the last lag values will be NA. When operation is NULL, the values are moved lag positions down the vector.
calcLagged(x, operation = NULL, lag = 1)
x |
A |
operation |
A |
lag |
A |
A vector containing the result of applying operation to values in x. For positive lag the first lag values will be NA, while for negative lag the last lag values will be NA.
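The lagging behaviour described above can be sketched in base R. This is a minimal illustration, not the package's own code: lagged_op is a hypothetical helper, and it assumes that a positive lag pairs each value with the one lag positions before it.

```r
# Hypothetical helper (not part of imageData): pair each value of x with the
# value `lag` positions away and apply a binary operation to the pair.
lagged_op <- function(x, operation = NULL, lag = 1) {
  n <- length(x)
  k <- abs(lag)
  stopifnot(k < n)
  if (lag >= 0) {
    # partner is the value k positions before; the first k entries are NA
    partner <- c(rep(NA, k), x[seq_len(n - k)])
  } else {
    # partner is the value k positions after; the last k entries are NA
    partner <- c(x[(k + 1):n], rep(NA, k))
  }
  if (is.null(operation)) return(partner)   # no operation: just shift the values
  match.fun(operation)(x, partner)
}

x <- c(2, 5, 9, 14)
lagged_op(x, operation = "-")            # NA  3  4  5
lagged_op(x, operation = "-", lag = -1)  # -3 -4 -5 NA
lagged_op(x)                             # NA  2  5  9
```

With operation = "-" and the default lag, the result is the vector of successive differences, which is how the Days.diffs column in the example is obtained.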
Chris Brien
data(exampleData)
longi.dat$Days.diffs <- calcLagged(longi.dat$xDays, operation ="-")
For the column specified in imageTimes, having converted it to POSIXct if not already converted, calculates for each value the number of intervalUnits between the time and the startTime. Then the number of timePositions within the intervals is calculated for the values in imageTimes. The function difftime is used in doing the calculations, but the results are converted to numeric. For example, intervals could correspond to the number of Days after Planting and the timePositions to the hour within each day.
calcTimes(data, imageTimes = NULL, timeFormat = "%Y-%m-%d %H:%M", intervals = "Time.after.Planting..d.", startTime = NULL, intervalUnit = "days", timePositions = NULL)
data |
A |
imageTimes |
A |
timeFormat |
A |
intervals |
A |
startTime |
A |
intervalUnit |
A |
timePositions |
A |
A data.frame, being the unchanged data data.frame when imageTimes is NULL, or containing intervals, timePositions or both, depending on which of them is not NULL.
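As a rough illustration of the two quantities, the following base-R sketch derives whole days after a start time and the hour within each day using difftime. The time stamps, the start time and the truncation to whole days are assumptions made for the example; the package may compute them differently.

```r
# Hypothetical time stamps in the default timeFormat
stamps <- c("2013-05-23 08:00", "2013-05-24 10:30", "2013-05-26 14:15")
fmt    <- "%Y-%m-%d %H:%M"
times  <- as.POSIXct(stamps, format = fmt, tz = "UTC")
start  <- as.POSIXct("2013-05-23 00:00", format = fmt, tz = "UTC")

# Elapsed time in days since the start, converted from difftime to numeric
elapsed <- as.numeric(difftime(times, start, units = "days"))

days <- floor(elapsed)                    # interval: whole days after the start
hour <- round((elapsed - days) * 24, 2)   # time position: hour within the day

data.frame(stamps, days, hour)
```

Here days is 0, 1 and 3 and hour is 8, 10.5 and 14.25.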
Chris Brien
data(exampleData)
raw.dat <- calcTimes(data = raw.dat, imageTimes = "Snapshot.Time.Stamp",
                     timePositions = "Hour")
Having calculated the correlations, a heat map indicating the magnitude of the correlations is produced using ggplot. In this heat map, the darker the red in a cell, the closer the correlation is to -1, while the deeper the blue in a cell, the closer the correlation is to 1. Also produced is a matrix plot of all pairwise combinations of the variables. The matrix plot contains a scatter diagram for each pair, as well as the value of the correlation coefficient. The argument pairs.sets can be used to restrict the pairs in the matrix plot to those combinations within each set.
corrPlot(responses, data, which.plots = c("heatmap","matrixplot"), title = NULL, labels = NULL, labelSize = 4, show.sig = FALSE, pairs.sets = NULL, ...)
responses |
A |
data |
A |
which.plots |
A |
title |
Title for the plots. |
labels |
A |
labelSize |
A |
show.sig |
A |
pairs.sets |
A |
... |
allows passing of arguments to other functions |
NULL.
Chris Brien
## Not run:
data(exampleData)
responses <- c("Area","Area.SV","Area.TV", "Image.Biomass", "Max.Height","Centre.Mass",
               "Density", "Compactness.TV", "Compactness.SV")
corrPlot(responses, longi.dat, pairs.sets=list(c(1:4),c(5:7)))
## End(Not run)
Uses cumsum to calculate the cumulative sum, ignoring the first element if exclude.1st is TRUE.
cumulate(x, exclude.1st = FALSE)
x |
A |
exclude.1st |
A |
A vector containing the cumulative sum.
Chris Brien
data(exampleData)
Area.cum <- cumulate(longi.dat$Area)
Adds the following factors and covariates to a data frame containing imaging data from the Plant Accelerator: Zones, xZones, SHZones, ZLane, ZMainplots, Subplots and xMainPosn. It checks that the numbers of levels of the factors are consistent with the observed numbers of carts and observations.
designFactors(data, insertName = NULL, designfactorMethod = "LanePosition", nzones = 6, nlanesperzone = 4, nmainplotsperlane = 11, nsubplotspermain = 2)
data |
A Smarthouse, Snapshot.ID.Tag, XDays, xPosn and, if |
insertName |
A |
designfactorMethod |
A |
nzones |
A |
nlanesperzone |
A |
nmainplotsperlane |
A |
nsubplotspermain |
A |
The factors Zones, ZLane, ZMainplots and Subplots are derived for each Smarthouse based on the values of nzones, nlanesperzone, nmainplotsperlane and nsubplotspermain, Zones being the blocks in the split-plot design. Thus, the number of carts in each Smarthouse must be the product of these values and the number of observations must be the product of the numbers of smarthouses, carts and imagings for each cart. If this is not the case, it may be possible to achieve it by including in data rows for extra observations that have values for the Snapshot.ID.Tag, Smarthouse, Lane, Position and Time.after.Planting..d., with the remaining columns for these rows having missing values (NA). Then SHZones is formed by combining Smarthouse and Zones, and the covariates xZones and xMainPosn are calculated. The covariate xZones is calculated from Zones and xMainPosn is formed from the mean of xPosn for each main plot.
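The size requirement can be checked by hand before calling the function. The sketch below uses the settings from the example call for the bundled data (nsubplotspermain left at its default of 2) and the stated size of longi.prime.dat (280 rows for 20 plants). The single smarthouse and the 14 imagings per cart are assumptions inferred from those figures.

```r
# Design settings as used in the example call
nzones            <- 1
nlanesperzone     <- 1
nmainplotsperlane <- 10
nsubplotspermain  <- 2

# Carts per smarthouse must equal the product of the four settings
carts_per_smarthouse <- nzones * nlanesperzone * nmainplotsperlane * nsubplotspermain

# Assumed for illustration: one smarthouse, 14 imagings of each cart
n_smarthouses <- 1
n_imagings    <- 14

# Expected number of rows in the data frame
expected_rows <- n_smarthouses * carts_per_smarthouse * n_imagings
c(carts = carts_per_smarthouse, rows = expected_rows)   # 20 carts, 280 rows
```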
A data.frame including the columns:
Smarthouse: factor with levels for the Smarthouse
Zones: factor dividing the Lanes into groups, usually of 4 lanes
xZones: numeric corresponding to Zones, centred by subtracting the mean of the unique positions
SHZones: factor for the combinations of Smarthouse and Zones
ZLane: factor for the lanes within a Zone
ZMainplots: factor for the main plots within a Zone
Subplots: factor for the subplots
xMainPosn: numeric for the main-plot positions within a Lane, centred by subtracting the mean of the unique positions
Chris Brien
data(exampleData)
longi.dat <- designFactors(longi.prime.dat, insertName = "xDays",
                           nzones = 1, nlanesperzone = 1, nmainplotsperlane = 10,
                           designfactorMethod="StandardOrder")
Imaging data for 20 of the plants from an experiment in a Smarthouse in the Plant Accelerator. It is used as a small example in the documentation for imageData.
data(exampleData)
Four data.frames: raw.dat (280 rows by 33 columns), longi.prime.dat (280 rows by 45 columns), longi.dat (280 rows by 63 columns), cart.dat (20 rows by 14 columns).
data.frame
Uses smooth.spline to fit a spline to all the values of response stored in data.
The amount of smoothing can be controlled by df. If df = NULL, the amount of smoothing is controlled by the default arguments and those you supply for smooth.spline. The method of Huang (2001) for correcting the fitted spline for estimation bias at the end-points will be applied if correctBoundaries is TRUE.
The derivatives of the fitted spline can also be obtained, and the Relative Growth Rate (RGR) computed using them, provided correctBoundaries is FALSE. Otherwise, growth rates can be obtained by difference using splitContGRdiff.
By default, smooth.spline will issue an error if there are not at least four distinct x-values. On the other hand, fitSpline issues a warning and sets all smoothed values and derivatives to NA. The handling of missing values in the observations is controlled via na.x.action and na.y.action.
fitSpline(data, response, x, df=NULL, smoothing.scale = "identity", correctBoundaries = FALSE, deriv=NULL, suffices.deriv=NULL, RGR=NULL, AGR=NULL, na.x.action="exclude", na.y.action = "exclude", ...)
data |
A |
response |
A |
x |
A |
df |
A |
smoothing.scale |
A |
correctBoundaries |
A |
deriv |
A |
suffices.deriv |
A |
RGR |
A |
AGR |
A |
na.x.action |
A |
na.y.action |
A |
... |
allows for arguments to be passed to |
A data.frame containing x and the fitted smooth. The names of the columns will be the value of x and the value of response with .smooth appended. The number of rows in the data.frame will be equal to the number of pairs that have neither a missing x nor a missing response, and it will have the same order of x as data. If deriv is not NULL, columns containing the values of the derivative(s) will be added to the data.frame; the name of each of these columns will be the value of response with .smooth.dvf appended, where f is the order of the derivative, or the value of response with .smooth. and the corresponding element of suffices.deriv appended. If RGR is not NULL, the RGR is calculated as the ratio of the value of the first derivative of the fitted spline to the fitted value for the spline.
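The smoothing and the derivative-based growth rates can be sketched with smooth.spline and its predict method from the stats package. The data are simulated, and the column names only mimic the naming described above. Boundary correction and missing-value handling are not covered.

```r
# Simulated growth curve for one plant (illustrative values only)
set.seed(1)
days <- 28:42
area <- 50 * exp(0.08 * (days - 28)) + rnorm(length(days), sd = 1)

# Fit a smoothing spline with 4 effective degrees of freedom
fit <- smooth.spline(days, area, df = 4)

# Fitted values and first derivative at the observed times
smooth <- predict(fit, x = days)$y
agr    <- predict(fit, x = days, deriv = 1)$y

# RGR as the ratio of the first derivative to the fitted value
rgr <- agr / smooth

out <- data.frame(xDays = days,
                  Area.smooth = smooth,
                  Area.smooth.AGRdv = agr,
                  Area.smooth.RGRdv = rgr)
head(out)
```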
Chris Brien
Huang, C. (2001). Boundary corrected cubic smoothing splines. Journal of Statistical Computation and Simulation, 70, 107-121.
splitSplines, smooth.spline, predict.smooth.spline, splitContGRdiff
data(exampleData)
fit <- fitSpline(longi.dat, response="Area", x="xDays", df = 4,
                 deriv=c(1,2), suffices.deriv=c("AGRdv","Acc"))
Forms a subset of responses in data that contains their values for the nominated times.
getDates(responses, times.factor = "Days", data, which.times, suffix = NULL, include.times.factor = FALSE, include.individuals = FALSE, individuals = "Snapshot.ID.Tag")
responses |
A |
times.factor |
A |
data |
A |
which.times |
A |
suffix |
A |
include.times.factor |
A |
include.individuals |
A |
individuals |
A |
A data.frame containing the subset of responses ordered by as many of the initial columns as are required to uniquely identify each row (see order for more information). The names of the columns for responses and times.factor in the subset are the concatenation of their names in data and suffix, separated by a full stop.
Chris Brien
data(exampleData)
AreaLast <- getDates("Area.smooth", data = longi.dat,
                     which.times = c(42), suffix = "last")
Calculates either the Absolute Growth Rate (AGR), Proportionate Growth Rate (PGR) or Relative Growth Rate (RGR) between pairs of time points, the second of which is lag positions before the first in x.
AGRdiff(x, time.diffs, lag=1)
PGR(x, time.diffs, lag=1)
RGRdiff(x, time.diffs, lag=1)
x |
A |
time.diffs |
a |
lag |
A |
The AGRdiff is calculated as the difference between a pair of values divided by the time.diffs. The PGR is calculated as the ratio of a value to a second value which is lag values ahead of the first in x, with the ratio raised to the power of the reciprocal of time.diffs. The RGRdiff is calculated as the log of the PGR and so is equal to the difference between the logarithms of a pair of values divided by the time.diffs. The differences and ratios are obtained using calcLagged with lag = 1.
A numeric containing the growth rates, which is the same length as x and in which the first lag values are NA.
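The three rates can be worked through on a short series in base R. The numbers are made up for illustration, and the lagged values are built directly rather than with calcLagged.

```r
# Hypothetical areas observed on unevenly spaced days
area <- c(10, 12, 15, 21)
days <- c(30, 31, 33, 34)

time.diffs <- c(NA, diff(days))       # days since the previous observation
prev       <- c(NA, head(area, -1))   # value one position earlier (lag = 1)

agr <- (area - prev) / time.diffs     # absolute growth rate
pgr <- (area / prev)^(1 / time.diffs) # proportionate growth rate
rgr <- log(pgr)                       # relative growth rate

# The RGR equals the difference of the logs divided by the time difference
all.equal(rgr, (log(area) - log(prev)) / time.diffs)

agr   # NA 2.0 1.5 6.0
```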
Chris Brien
intervalGRaverage, intervalGRdiff, splitContGRdiff, splitSplines, calcLagged
data(exampleData)
longi.dat$Area.AGR <- with(longi.dat, AGRdiff(Area, time.diffs = Days.diffs))
Uses ggplot to produce a plot of the time position within an interval against the interval. For example, one might plot the hour of the day carts are imaged against the days after planting (or some other number of days after an event). A line is produced for each value of groupVariable and the colour is varied according to the value of the colourVariable. Each Smarthouse is plotted separately. It aids in checking whether delays occurred in imaging the plants.
imagetimesPlot(data, intervals = "Time.after.Planting..d.", timePositions = "Hour", groupVariable = "Snapshot.ID.Tag", colourVariable = "Lane", ggplotFuncs = NULL)
data |
A |
intervals |
A |
timePositions |
A |
groupVariable |
A |
colourVariable |
A |
ggplotFuncs |
A |
An object of class "ggplot", which can be plotted using print.
Chris Brien
data(exampleData)
library(ggplot2)
longi.dat <- calcTimes(longi.dat, imageTimes = "Snapshot.Time.Stamp",
                       timePositions = "Hour")
imagetimesPlot(data = longi.dat, intervals = "Days", timePositions = "Hour",
               ggplotFuncs=list(scale_colour_gradient(low="grey20", high="black"),
                                geom_line(aes(group=Snapshot.ID.Tag, colour=Lane))))
Uses readxl to import a sheet of imaging data produced by the Lemna Tec Scanalyzer. Basically, the data consists of imaging data obtained from a set of pots or carts over time. There should be a column, which by default is called Snapshot.ID.Tag, containing a unique identifier for each cart and a column, which by default is labelled Snapshot.Time.Stamp, containing the time of imaging for each observation in a row of the sheet. Also, if startTime is not NULL, calcTimes is called to calculate, or recalculate if already present, timeAfterStart from imageTimes by subtracting a supplied startTime.
Using cameraType, keepCameraType, labsCamerasViews and prefix2suffix, some flexibility is provided for renaming the columns with imaging data. For example, if the column names are prefixed with 'RGB_SV1', 'RGB_SV2' or 'RGB_TV', the 'RGB_' can be removed and the 'SV1', 'SV2' or 'TV' become suffices.
importExcel(file, sheet="raw data", sep = ",", cartId = "Snapshot.ID.Tag", imageTimes = "Snapshot.Time.Stamp", timeAfterStart = "Time.after.Planting..d.", cameraType = "RGB", keepCameraType = FALSE, labsCamerasViews = NULL, prefix2suffix = TRUE, startTime = NULL, timeFormat = "%Y-%m-%d %H:%M", imagetimesPlot = TRUE, ...)
file |
A |
sheet |
A |
sep |
A |
cartId |
A |
imageTimes |
A |
timeAfterStart |
A |
cameraType |
A |
keepCameraType |
A |
labsCamerasViews |
A named |
prefix2suffix |
A |
startTime |
A |
timeFormat |
A |
imagetimesPlot |
A |
... |
allows for arguments to be passed to |
A data.frame containing the data.
Chris Brien
as.POSIXct, calcTimes, imagetimesPlot
## Not run:
raw.0169.dat <- importExcel(file = "0169 analysis_20140603.xlsx",
                            startTime = "2013-05-23 8:00 AM")
camview.labels <- c("SF0", "SL0", "SU0", "TV0")
names(camview.labels) <- c("RGB_Side_Far_0", "RGB_Side_Lower_0",
                           "RGB_Side_Upper_0", "RGB_TV_0")
raw.19.dat <- suppressWarnings(importExcel(file = "./data/raw19datarow.csv",
                                           cartId = "Snapshot.ID.Tags",
                                           startTime = "06/10/2017 0:00 AM",
                                           timeFormat = "%d/%m/%Y %H:%M",
                                           labsCamerasViews = camview.labels,
                                           imagetimesPlot = FALSE))
## End(Not run)
Using previously calculated growth rates over time, calculates the Absolute Growth Rates for a specified interval using the weighted averages of AGRs for each time point in the interval (AGR) and the Relative Growth Rates for a specified interval using the weighted geometric means of RGRs for each time point in the interval (RGR).
intervalGRaverage(responses, individuals = "Snapshot.ID.Tag", which.rates = c("AGR","RGR"), suffices.rates=c("AGR","RGR"), start.time, end.time, times.factor = "Days", suffix.interval, data, sep=".", na.rm=TRUE)
responses |
A |
individuals |
A |
which.rates |
A |
suffices.rates |
A |
start.time |
A |
end.time |
A |
times.factor |
A |
suffix.interval |
A |
data |
A |
sep |
A |
na.rm |
A |
The AGR for an interval is calculated as the weighted mean of the AGRs for times within the interval. The RGR is calculated as the weighted geometric mean of the RGRs for times within the interval; in fact, the exponential is taken of the weighted means of the logs of the RGRs. The weights are obtained from the times.factor. They are taken as the sum of half the time subintervals before and after each time, except for the end points; the end points are taken to be the subintervals at the start and end of the interval.
A data.frame with the growth rates. The name of each column is the concatenation of (i) one of responses, (ii) one of AGR, PGR or RGR, or the appropriate element of suffices.rates, and (iii) suffix.interval, the three components being separated by full stops.
Chris Brien
intervalGRdiff, intervalWUI, splitValueCalculate, getDates, GrowthRates, splitSplines, splitContGRdiff
data(exampleData)
longi.dat <- splitSplines(longi.dat, response="Area", x="xDays",
                          INDICES = "Snapshot.ID.Tag",
                          df = 4, deriv=1, suffices.deriv="AGRdv", RGR="RGRdv")
Area.smooth.GR <- intervalGRaverage("Area.smooth", which.rates = c("AGR","RGR"),
                                    suffices.rates = c("AGRdv","RGRdv"),
                                    start.time = 31, end.time = 35,
                                    suffix.interval = "31to35",
                                    data = longi.dat)
Using the values of the responses, calculates the specified combination of the Absolute Growth Rates using differences (AGR), the Proportionate Growth Rates (PGR) and Relative Growth Rates using log differences (RGR) between two nominated time points.
intervalGRdiff(responses, individuals = "Snapshot.ID.Tag", which.rates = c("AGR","PGR","RGR"), suffices.rates=NULL, times.factor = "Days", start.times, end.times, suffix.interval, data)
responses |
A |
individuals |
A |
which.rates |
A |
suffices.rates |
A |
times.factor |
A |
start.times |
A |
end.times |
A |
suffix.interval |
A |
data |
A |
The AGR is calculated as the difference between the values of response at the end.times and start.times divided by the difference between end.times and start.times. The PGR is calculated as the ratio of response at the end.times to that at start.times, with the ratio raised to the power of the reciprocal of the difference between end.times and start.times. The RGR is calculated as the log of the PGR and so is equal to the difference between the logarithms of response at the end.times and start.times divided by the difference between end.times and start.times.
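For two nominated time points, the calculation for each individual can be sketched as follows. The data frame is invented for the example, and the merge-based layout is just one way to pair the start and end values, not necessarily how the package does it. The output column names follow the naming rule given under Value.

```r
# Hypothetical smoothed areas for two carts on three days
dat <- data.frame(
  Snapshot.ID.Tag = rep(c("A", "B"), each = 3),
  Days            = rep(c(31, 33, 35), times = 2),
  Area.smooth     = c(100, 121, 144, 80, 90, 125)
)
start <- 31
end   <- 35
dt    <- end - start

# Pair each individual's value at the start with its value at the end
gr <- merge(dat[dat$Days == start, c("Snapshot.ID.Tag", "Area.smooth")],
            dat[dat$Days == end,   c("Snapshot.ID.Tag", "Area.smooth")],
            by = "Snapshot.ID.Tag", suffixes = c(".start", ".end"))

gr$Area.smooth.AGR.31to35 <- (gr$Area.smooth.end - gr$Area.smooth.start) / dt
gr$Area.smooth.PGR.31to35 <- (gr$Area.smooth.end / gr$Area.smooth.start)^(1 / dt)
gr$Area.smooth.RGR.31to35 <- log(gr$Area.smooth.PGR.31to35)

gr$Area.smooth.AGR.31to35   # 11.00 11.25
```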
A data.frame with the growth rates. The name of each column is the concatenation of (i) one of responses, (ii) one of AGR, PGR or RGR, or the appropriate element of suffices.rates, and (iii) suffix.interval, the three components being separated by full stops.
Chris Brien
intervalGRaverage, intervalWUI, getDates, GrowthRates, splitSplines, splitContGRdiff
data(exampleData)
Area.smooth.GR <- intervalGRdiff("Area.smooth", which.rates = c("AGR","RGR"),
                                 start.times = 31, end.times = 35,
                                 suffix.interval = "31to35",
                                 data = longi.dat)
Principal Variable Analysis (PVA) (Cumming, 2007) selects a subset from a set of the variables such that the variables in the subset are as uncorrelated as possible, in an effort to ensure that all aspects of the variation in the data are covered. Here, all observations in a specified time interval are used for calculating the correlations on which the selection is based.
intervalPVA(responses, data, times.factor = "Days", start.time, end.time, nvarselect = NULL, p.variance = 1, include = NULL, plot = TRUE, ...)
responses |
A |
data |
A |
times.factor |
A |
start.time |
A |
end.time |
A |
nvarselect |
A |
p.variance |
A |
include |
A |
plot |
A |
... |
allows passing of arguments to other functions. |
The variable that is most correlated with the other variables is selected first for inclusion. The partial correlation for each of the remaining variables, given the first selected variable, is calculated and the most correlated of these variables is selected for inclusion next. Then the partial correlations are adjusted for the second included variable. This process is repeated until the specified criteria have been satisfied. The possibilities are to:
by default (nvarselect = NULL and p.variance = 1), select all variables in increasing order of amount of information they provide;
select exactly nvarselect variables;
select just enough variables, up to a maximum of nvarselect variables, to explain at least p.variance*100 per cent of the total variance.
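The stepwise selection can be sketched in base R. This is an illustrative reading of the description, not the package's implementation: select_principal is a hypothetical function, the criterion is assumed to be the sum of squared (partial) correlations of each candidate with the other candidates, and the p.variance stopping rule is omitted.

```r
# Hypothetical greedy principal-variable selection (not the package's code)
select_principal <- function(X, nvarselect = ncol(X)) {
  C <- cor(X)                      # start from the correlation matrix
  remaining <- colnames(C)
  selected  <- character(0)
  while (length(selected) < nvarselect && length(remaining) > 0) {
    # (Partial) correlations among the variables not yet selected
    P <- cov2cor(C[remaining, remaining, drop = FALSE])
    # Assumed criterion: sum of squared correlations with the other candidates
    h <- rowSums(P^2)
    best <- names(h)[which.max(h)]
    selected  <- c(selected, best)
    remaining <- setdiff(remaining, best)
    # Adjust for the selected variable: partial covariance given `best`
    C <- C - outer(C[, best], C[, best]) / C[best, best]
  }
  selected
}

# Simulated traits: three noisy measures of plant size and one unrelated trait
set.seed(42)
n    <- 50
size <- rnorm(n)
X <- data.frame(Area    = size + rnorm(n, sd = 0.2),
                Area.SV = size + rnorm(n, sd = 0.2),
                Area.TV = size + rnorm(n, sd = 0.3),
                Density = rnorm(n))
select_principal(X, nvarselect = 2)
```

Each pass removes from the remaining variables the variation already explained by the selected ones, so a second size measure carries little extra weight once the first has been chosen.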
A data.frame giving the results of the variable selection. It will contain the columns Variable, Selected, h.partial, Added.Propn and Cumulative.Propn.
Chris Brien
Cumming, J. A. and D. A. Wooff (2007) Dimension reduction via principal variables. Computational Statistics and Data Analysis, 52, 550–565.
data(exampleData)
responses <- c("Area","Area.SV","Area.TV", "Image.Biomass", "Max.Height","Centre.Mass",
               "Density", "Compactness.TV", "Compactness.SV")
results <- intervalPVA(responses, longi.dat,
                       start.time = "31", end.time = "31",
                       p.variance=0.9, plot = FALSE)
Splits the values of a response into subsets corresponding to individuals and applies a function that calculates a single value from each individual's observations during a specified time interval. It includes the ability to calculate the observation that corresponds to the calculated value of the function.
intervalValueCalculate(response, weights=NULL, individuals = "Snapshot.ID.Tag", FUN = "max", which.obs = FALSE, which.levels = NULL, start.time=NULL, end.time=NULL, times.factor = "Days", suffix.interval=NULL, data, sep=".", na.rm=TRUE, ...)
response |
A |
weights |
A |
individuals |
A |
FUN |
A |
which.obs |
A |
which.levels |
A |
start.time |
A |
end.time |
A |
times.factor |
A |
suffix.interval |
A |
data |
A |
na.rm |
A |
sep |
A |
... |
allows for arguments to be passed to |
A data.frame, with the same number of rows as there are individuals, containing a column for the individuals, a column with the values of the function for the individuals, and a column with the values of the times.factor. The name of the column with the values of the function will be the result of concatenating the response, FUN and, if it is not NULL, suffix.interval, each separated by a full stop.
Chris Brien
intervalGRaverage, intervalGRdiff, intervalWUI, splitValueCalculate, getDates
data(exampleData)
Area.smooth.max <- intervalValueCalculate("Area.smooth",
                                          start.time = 31, end.time = 35,
                                          suffix.interval = "31to35",
                                          data = longi.dat)
Calculates the Water Use Index (WUI) between two time points for a set of responses.
intervalWUI(responses, water.use = "Water.Use", individuals = "Snapshot.ID.Tag", times.factor = "Days", start.times, end.times, suffix.interval = NULL, data, include.total.water = FALSE, na.rm = FALSE)
responses |
A |
water.use |
A |
individuals |
A |
times.factor |
A |
start.times |
A |
end.times |
A |
suffix.interval |
A |
data |
A |
include.total.water |
A |
na.rm |
A |
The WUI is calculated as the difference between the values of a response at the end.times and start.times divided by the sum of the water use after start.times until end.times. Thus, the water use up to start.times is not included.
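A small worked example of this ratio for a single cart, using made-up values, shows which water-use records enter the denominator.

```r
# Hypothetical records for one cart
dat <- data.frame(Days       = c(31, 32, 33, 34, 35),
                  Area       = c(100, 110, 124, 135, 150),
                  Water.Loss = c(20, 22, 25, 26, 27))
start <- 31
end   <- 35

# Growth between the two time points
growth <- dat$Area[dat$Days == end] - dat$Area[dat$Days == start]   # 50

# Water used after the start time, up to and including the end time
water <- sum(dat$Water.Loss[dat$Days > start & dat$Days <= end])    # 100

wui <- growth / water
wui   # 0.5
```

The water loss recorded on day 31 is excluded because it relates to the period before the interval begins.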
A data.frame containing the WUIs, the name of each column being the concatenation of one of responses, WUI and, if not NULL, suffix.interval, the three components being separated by a full stop. If the total water is to be included, the name of the column will be the concatenation of water.use, Total and the suffix, each separated by a full stop (‘.’).
Chris Brien
intervalGRaverage, intervalGRdiff, splitValueCalculate, getDates, GrowthRates
data(exampleData)
Area.smooth.WUI <- intervalWUI("Area", water.use = "Water.Loss",
                               start.times = 31, end.times = 35,
                               suffix.interval = "31to35",
                               data = longi.dat, include.total.water = TRUE)
Produces profile or longitudinal plots of the data from a Lemna Tec Scananalyzer using ggplot. A line is drawn for the data for each Snapshot.ID.Tag and the plot can be facetted so that a grid of plots is produced.
longiPlot(data, x = "xDays+44.5", response = "Area", individuals="Snapshot.ID.Tag", x.title = "Days", y.title = "Area (1000 pixels)", title = NULL, facet.x = "Treatment.1", facet.y = "Smarthouse", labeller = NULL, colour = "black", colour.column=NULL, colour.values=NULL, alpha = 0.1, ggplotFuncs = NULL, printPlot = TRUE)
data |
A |
x |
A |
response |
A |
individuals |
A |
x.title |
Title for the x-axis. |
y.title |
Title for the y-axis. |
title |
Title for the plot. |
facet.x |
A |
facet.y |
A |
labeller |
A |
colour |
A |
colour.column |
A |
colour.values |
A |
alpha |
A |
ggplotFuncs |
A |
printPlot |
A |
An object of class "ggplot", which can be plotted using print.
Chris Brien
data(exampleData)
longiPlot(data = longi.dat, response = "Area.smooth")
plt <- longiPlot(data = longi.dat, response = "Area.smooth", x.title = "DAP",
                 y.title = "Area.smooth", x="xDays+35.42857143", printPlot=FALSE)
plt <- plt + ggplot2::geom_vline(xintercept=29, linetype="longdash", size=1) +
  ggplot2::scale_x_continuous(breaks=seq(28, 42, by=2)) +
  ggplot2::scale_y_continuous(limits=c(0,750))
print(plt)
longiPlot(data = longi.dat, response = "Area.smooth", x.title = "DAP",
          y.title = "Area.smooth", x="xDays+35.42857143",
          ggplotFuncs = list(ggplot2::geom_vline(xintercept=29, linetype="longdash", size=1),
                             ggplot2::scale_x_continuous(breaks=seq(28, 42, by=2)),
                             ggplot2::scale_y_continuous(limits=c(0,750))))
Forms the prime traits by selecting a subset of the traits in a data.frame of imaging data produced by the Lemna Tec Scanalyzer. The imaging traits to be retained are specified using the traits and labsCamerasViews arguments. Some imaging traits are divided by 1000 to convert them from pixels to kilopixels. Also added are factors and explanatory variates that might be of use in an analysis.
longitudinalPrime(data, cartId = "Snapshot.ID.Tag", imageTimes = "Snapshot.Time.Stamp", timeAfterStart = "Time.after.Planting..d.", idcolumns = c("Genotype.ID","Treatment.1"), traits = list(all = c("Area", "Boundary.Points.To.Area.Ratio", "Caliper.Length", "Compactness", "Convex.Hull.Area"), side = c("Center.Of.Mass.Y", "Max.Dist.Above.Horizon.Line")), labsCamerasViews = list(all = c("SV1", "SV2", "TV"), side = c("SV1", "SV2")), smarthouse.lev = NULL, calcWaterLoss = TRUE, pixelsPERcm)
data |
A Smarthouse, Lane, Position, Weight.Before, Weight.After, Water.Amount, Projected.Shoot.Area..pixels. The defaults for the arguments to Smarthouse, Lane, Position, Weight.Before, Weight.After, Water.Amount, Projected.Shoot.Area..pixels., Area.SV1, Area.SV2, Area.TV, Boundary.Points.To.Area.Ratio.SV1, Boundary.Points.To.Area.Ratio.SV2, Boundary.Points.To.Area.Ratio.TV, Caliper.Length.SV1, Caliper.Length.SV2, Caliper.Length.TV, Compactness.SV1, Compactness.SV2, Compactness.TV, Convex.Hull.Area.SV1, Convex.Hull.Area.SV2, Convex.Hull.Area.TV, Center.Of.Mass.Y.SV1, Center.Of.Mass.Y.SV2, Max.Dist.Above.Horizon.Line.SV1, Max.Dist.Above.Horizon.Line.SV2. |
cartId |
A |
imageTimes |
A |
timeAfterStart |
A |
idcolumns |
A |
traits |
A |
labsCamerasViews |
A |
smarthouse.lev |
A |
calcWaterLoss |
A |
pixelsPERcm |
A |
The columns are copied from data, except for those columns in the list under Value that have ‘(calculated)’ appended.
A data.frame containing the columns specified by cartId, imageTimes, timeAfterStart, idcolumns, traits and labsCamerasViews. The defaults will result in the following columns:
Smarthouse: factor with levels for the Smarthouse
Lane: factor for lane number in a smarthouse
Position: factor for east/west position in a lane
Days: factor for the number of Days After Planting (DAP)
cartId: unique code for each cart
imageTimes: time at which an image was taken in POSIXct format
Reps: factor indexing the replicates for each combination of the factors in idcolumns (calculated)
xPosn: numeric for the Positions within a Lane (calculated)
Hour: hour of the day, to 2 decimal places, at which the image was taken (calculated)
xDays: numeric for the DAP that is centred by subtracting the mean of the unique days (calculated)
idcolumns: the columns listed in idcolumns that have been converted to factors
Weight.Before: weight of the pot before watering (only if calcWaterLoss is TRUE)
Weight.After: weight of the pot after watering (only if calcWaterLoss is TRUE)
Water.Amount: the weight of the water added (= Weight.After - Weight.Before) (calculated)
Water.Loss: the difference between Weight.Before for the current imaging and the Weight.After for the previous imaging (calculated unless calcWaterLoss is FALSE)
Area: the Projected.Shoot.Area..pixels. divided by 1000 (calculated)
Area.SV1: the Projected.Shoot.Area from Side View 1 divided by 1000 (calculated)
Area.SV2: the Projected.Shoot.Area from Side View 2 divided by 1000 (calculated)
Area.TV: the Projected.Shoot.Area from Top View divided by 1000 (calculated)
Boundary.Points.To.Area.Ratio.SV1
Boundary.Points.To.Area.Ratio.SV2
Boundary.Points.To.Area.Ratio.TV
Caliper.Length.SV1
Caliper.Length.SV2
Caliper.Length.TV
Compactness.SV1: from Side View 1
Compactness.SV2: from Side View 2
Compactness.TV: from Top View
Convex.Hull.Area.SV1: area of Side View 1 Convex Hull divided by 1000 (calculated)
Convex.Hull.Area.SV2: area of Side View 2 Convex Hull divided by 1000 (calculated)
Convex.Hull.Area.TV: area of Top View Convex Hull divided by 1000 (calculated)
Center.Of.Mass.Y.SV1: Centre of Mass from Side View 1
Center.Of.Mass.Y.SV2: Centre of Mass from Side View 2
Max.Dist.Above.Horizon.Line.SV1: the Max.Dist.Above.Horizon.Line.SV1 converted to cm using pixelsPERcm (calculated)
Max.Dist.Above.Horizon.Line.SV2: the Max.Dist.Above.Horizon.Line.SV2 converted to cm using pixelsPERcm (calculated)
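The calculated columns listed above can be illustrated with a minimal base R sketch. It is not the package's code: the toy data, the calibration value of 18 pixels per cm and the sign convention for Water.Loss (previous Weight.After minus current Weight.Before) are assumptions made for illustration, and only a single cart is handled.

```r
# Toy raw data for one cart imaged on three days (all values invented).
raw <- data.frame(
  Snapshot.ID.Tag = "cart01",
  Days = c(10, 11, 12),
  Projected.Shoot.Area..pixels. = c(52000, 61000, 73000),
  Max.Dist.Above.Horizon.Line.SV1 = c(360, 396, 450),
  Weight.Before = c(980, 975, 978),
  Weight.After = c(1000, 1000, 1000)
)
pixelsPERcm <- 18  # assumed calibration value

prime <- within(raw, {
  # pixels to kilopixels
  Area <- Projected.Shoot.Area..pixels. / 1000
  # pixels to cm
  Max.Dist.Above.Horizon.Line.SV1 <- Max.Dist.Above.Horizon.Line.SV1 / pixelsPERcm
  # water added at this imaging
  Water.Amount <- Weight.After - Weight.Before
  # water lost since the previous imaging (assumed sign convention)
  Water.Loss <- c(NA, head(Weight.After, -1)) - Weight.Before
})
prime[, c("Days", "Area", "Water.Amount", "Water.Loss")]
```

With several carts, the Water.Loss step would have to be applied within each cart so that the lag does not run across carts.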
Chris Brien
data(exampleData)
# Default traits and camera views
longiPrime.dat <- longitudinalPrime(data=raw.dat, smarthouse.lev=1)
# Area for all views, Compactness for the top view only
longiPrime.dat <- longitudinalPrime(data=raw.dat, smarthouse.lev=1,
                                    traits = list(a = "Area", c = "Compactness"),
                                    labsCamerasViews = list(all = c("SV1", "SV2", "TV"),
                                                            t = "TV"))
# Fully specified trait names, so no camera-view labels are needed
longiPrime.dat <- longitudinalPrime(data=raw.dat, smarthouse.lev=1,
                                    traits = c("Area.SV1", "Area.SV2", "Area.TV",
                                               "Compactness.TV"),
                                    labsCamerasViews = NULL)
# Retain the watering columns without calculating Water.Loss
longiPrime.dat <- longitudinalPrime(data=raw.dat, smarthouse.lev=1,
                                    calcWaterLoss = FALSE,
                                    traits = list(img = c("Area", "Compactness"),
                                                  H2O = c("Weight.Before","Weight.After",
                                                          "Water.Amount")),
                                    labsCamerasViews = list(all = c("SV1", "SV2", "TV"),
                                                            H2O = NULL))
Takes a response and, for each individual, uses splitSplines to smooth its values using the degrees of freedom values in df. Provided get.rates is TRUE, both the Absolute Growth Rates (AGR) and the Relative Growth Rates (RGR) are calculated for each smooth, either using differences or first derivatives. A combination of the unsmoothed and smoothed values, as well as the AGR and RGR, can be plotted for each value in df. Note that the arguments that modify the plots apply to all plots that are produced. The handling of missing values is controlled via na.x.action and na.y.action.
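The idea of probing several df values can be sketched in base R for a single individual. This is an illustrative sketch only, not probeDF itself: the simulated data are invented, rates are obtained by differences, and no plots are produced. The column names follow the pattern described for the returned data.frame, with the df value as the last element.

```r
set.seed(1)
one <- data.frame(xDays = 1:12)
one$Area <- 5 * exp(0.2 * one$xDays) + rnorm(12, sd = 0.5)  # invented growth curve

for (df in c(4, 7)) {
  fit <- smooth.spline(one$xDays, one$Area, df = df)
  sm <- paste("Area.smooth", df, sep = ".")
  one[[sm]] <- predict(fit, one$xDays)$y
  # Growth rates by differences between successive times
  dt <- diff(one$xDays)
  one[[paste("Area.smooth.AGR", df, sep = ".")]] <- c(NA, diff(one[[sm]]) / dt)
  one[[paste("Area.smooth.RGR", df, sep = ".")]] <- c(NA, diff(log(one[[sm]])) / dt)
}
names(one)
```

Comparing the columns for df = 4 and df = 7 shows how a larger df lets the smooth follow the observed values more closely.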
probeDF(data, response = "Area", xname="xDays", individuals="Snapshot.ID.Tag", na.x.action="exclude", na.y.action = "exclude", df, smoothing.scale = "identity", correctBoundaries = FALSE, get.rates = TRUE, rates.method="differences", times.factor = "Days", x = NULL, x.title = NULL, facet.x = "Treatment.1", facet.y = "Smarthouse", labeller = NULL, colour = "black", colour.column=NULL, colour.values=NULL, alpha = 0.1, which.traits = c("response", "AGR", "RGR"), which.plots = "smoothedonly", deviations.boxplots = "none", ggplotFuncs = NULL, ...)
data |
A |
response |
A |
xname |
A |
individuals |
A |
na.x.action |
A |
na.y.action |
A |
df |
A |
smoothing.scale |
A |
correctBoundaries |
A |
get.rates |
A |
rates.method |
A |
times.factor |
A |
x |
A |
x.title |
Title for the x-axis. If |
facet.x |
A |
facet.y |
A |
labeller |
A |
colour |
A |
colour.column |
A |
colour.values |
A |
alpha |
A |
which.traits |
A |
which.plots |
A |
deviations.boxplots |
A |
ggplotFuncs |
A |
... |
allows passing of arguments to |
A data.frame containing individuals, times.factor, facet.x, facet.y, xname, response, and, for each df, the smoothed response, the AGR and the RGR. It is returned invisibly. The names of the new data are constructed by joining elements separated by full stops (.). In all cases, the last element is the value of df. For the smoothed response, the other elements are response and "smooth"; for AGR and RGR, the other elements are the name of the smoothed response and either "AGR" or "RGR".
Chris Brien
splitSplines, splitContGRdiff, smooth.spline, ggplot.
data(exampleData)
# Extra ggplot2 layers to add to each plot
vline <- list(ggplot2::geom_vline(xintercept=20, linetype="longdash", size=1),
              ggplot2::scale_x_continuous(breaks=seq(12, 36, by=2)))
probeDF(data = longi.dat, response = "Area", df = c(4,7), x="xDays+24.16666667",
        ggplotFuncs=vline)
Principal Variable Analysis (PVA) (Cumming and Wooff, 2007) selects a subset of a set of variables such that the variables in the subset are as uncorrelated as possible, in an effort to ensure that all aspects of the variation in the data are covered.
PVA(responses, data, nvarselect = NULL, p.variance = 1, include = NULL, plot = TRUE, ...)
responses |
A |
data |
A |
nvarselect |
A |
p.variance |
A |
include |
A |
plot |
A |
... |
allows passing of arguments to other functions |
The variable that is most correlated with the other variables is selected first for inclusion. The partial correlation for each of the remaining variables, given the first selected variable, is calculated and the most correlated of these variables is selected for inclusion next. Then the partial correlations are adjusted for the second included variable. This process is repeated until the specified criteria have been satisfied. The possibilities are:
the default (nvarselect = NULL and p.variance = 1), which selects all variables in increasing order of amount of information they provide;
to select exactly nvarselect variables;
to select just enough variables, up to a maximum of nvarselect variables, to explain at least p.variance*100 per cent of the total variance.
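The selection loop described above can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not the package's implementation: pva_sketch and the toy data are hypothetical, only the "select exactly nvarselect variables" rule is covered, and the include argument and the proportion of variance explained are omitted.

```r
# Hypothetical helper: greedy selection using partial correlations.
pva_sketch <- function(X, nvarselect = ncol(X)) {
  R <- cor(X)
  selected <- character(0)
  remaining <- colnames(X)
  while (length(selected) < nvarselect && length(remaining) > 0) {
    if (length(selected) == 0) {
      P <- R[remaining, remaining, drop = FALSE]
    } else {
      # Partial correlations of the remaining variables given those selected
      Rss <- R[selected, selected, drop = FALSE]
      Rrs <- R[remaining, selected, drop = FALSE]
      partial <- R[remaining, remaining, drop = FALSE] - Rrs %*% solve(Rss, t(Rrs))
      P <- cov2cor(partial)
    }
    # Root sum of squared (partial) correlations with the other variables
    h <- sqrt(pmax(rowSums(P^2) - 1, 0))
    best <- names(which.max(h))
    selected <- c(selected, best)
    remaining <- setdiff(remaining, best)
  }
  selected
}

set.seed(123)
n <- 200
a <- rnorm(n)
X <- cbind(a = a,
           b = a + rnorm(n, sd = 0.5),
           c = a + rnorm(n, sd = 0.5),
           d = rnorm(n))
pva_sketch(X)
pva_sketch(X, nvarselect = 2)
```

In the toy data, b and c are both noisy copies of a, so a is the variable most correlated with the rest.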
A data.frame giving the results of the variable selection. It will contain the columns Variable, Selected, h.partial, Added.Propn and Cumulative.Propn.
Chris Brien
Cumming, J. A. and D. A. Wooff (2007) Dimension reduction via principal variables. Computational Statistics and Data Analysis, 52, 550–565.
data(exampleData)
responses <- c("Area","Area.SV","Area.TV", "Image.Biomass", "Max.Height","Centre.Mass",
               "Density", "Compactness.TV", "Compactness.SV")
# Select enough variables to explain at least 90 per cent of the total variance
results <- PVA(responses, longi.dat, p.variance=0.9, plot = FALSE)
A measure of how correlated a variable is with those in a set is given by the square root of the sum of squares of the correlation coefficients between the variable and the other variables in the set (Cumming and Wooff, 2007). Here, the partial correlation between the subset of the variables listed in responses that are not listed in include is calculated from the partial correlation matrix for the subset, adjusting for those variables in include. This is useful for manually deciding which of the variables not in include should next be added to it.
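One way to see what this measure is doing is to obtain the partial correlations from regression residuals: the partial correlation between two variables, given those in include, equals the correlation between their residuals after each is regressed on include. The following base R sketch uses that equivalence; the simulated data and variable values are invented and the code is not taken from the package.

```r
set.seed(42)
n <- 100
Area <- rnorm(n)
d <- data.frame(Area = Area,
                Area.SV = Area + rnorm(n, sd = 0.3),
                Max.Height = 0.5 * Area + rnorm(n),
                Density = rnorm(n))

include <- "Area"
others <- setdiff(names(d), include)

# Regress every other variable on the included one and keep the residuals
fit <- lm(as.matrix(d[others]) ~ as.matrix(d[include]))
P <- cor(resid(fit))                 # partial correlation matrix given include
dimnames(P) <- list(others, others)

# Root sum of squared partial correlations with the other variables
h <- sqrt(pmax(rowSums(P^2) - 1, 0))
h
```

The variable with the largest h is the natural candidate to add to include next.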
rcontrib(responses, data, include = NULL)
responses |
A |
data |
A |
include |
A |
A numeric giving the correlation measures.
Chris Brien
Cumming, J. A. and D. A. Wooff (2007) Dimension reduction via principal variables. Computational Statistics and Data Analysis, 52, 550–565.
data(exampleData)
responses <- c("Area","Area.SV","Area.TV", "Image.Biomass", "Max.Height","Centre.Mass",
               "Density", "Compactness.TV", "Compactness.SV")
# Correlation measures for the remaining variables, adjusted for Area
h <- rcontrib(responses, longi.dat, include = "Area")
The data is from an experiment in a Smarthouse in the Plant Accelerator. It is described in Al-Tamimi et al. (2016). It is used in imageData-package as an executable example to illustrate the use of imageData.
data(RiceRaw.dat)
A data.frame containing 14784 observations on 33 variables.
It will be made available on Dryad
Al-Tamimi, N, Brien, C.J., Oakey, H., Berger, B., Saade, S., Ho, Y. S., Schmockel, S. M., Tester, M. and Negrao, S. (2016) New salinity tolerance loci revealed in rice using high-throughput non-invasive phenotyping. Nature Communications, 7, 13342.
Uses AGRdiff, PGR and RGRdiff to calculate growth rates continuously over time for subsets of the values of responses and stores the results in data. The subsets are those values with the same combinations of the levels of the factors listed in INDICES.
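The following base R sketch shows growth rates calculated by differences within each individual. It is an illustration, not the package's code: it assumes the usual definitions (AGR as the change in the response per unit time, PGR as the ratio of successive values raised to the power of one over the time difference, and RGR as the log of the PGR), uses a numeric Days column rather than a factor, and the helper rates_by_diff is hypothetical.

```r
longi <- data.frame(
  Snapshot.ID.Tag = rep(c("A", "B"), each = 3),
  Days = rep(c(10, 12, 14), times = 2),
  Area = c(10, 20, 40, 5, 5, 10)   # invented values
)

# Hypothetical helper: rates between successive imagings of one individual
rates_by_diff <- function(d) {
  dt <- c(NA, diff(d$Days))
  d$Days.diff <- dt
  d$Area.AGR <- c(NA, diff(d$Area)) / dt
  d$Area.PGR <- (d$Area / c(NA, head(d$Area, -1)))^(1 / dt)
  d$Area.RGR <- c(NA, diff(log(d$Area))) / dt
  d
}

# Apply within each subset defined by the individual, then recombine
out <- do.call(rbind, lapply(split(longi, longi$Snapshot.ID.Tag), rates_by_diff))
rownames(out) <- NULL
out
```

The first observation of each individual has no preceding value, so its rates are NA.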
splitContGRdiff(data, responses, INDICES, which.rates = c("AGR","PGR","RGR"), suffices.rates=NULL, times.factor = "Days")
data |
A |
responses |
A |
INDICES |
A |
which.rates |
A |
times.factor |
A |
suffices.rates |
A |
A data.frame containing data to which has been added a column for the differences between the times.factor, if it is not already in data, and columns with growth rates. The name of the column for times.factor differences will be the times.factor with ".diff" appended and, for each of the growth-rate columns, it will be the value of responses with one of ".AGR", ".PGR" or ".RGR", or the corresponding value from suffices.rates, appended.
Chris Brien
data(exampleData)
longi.dat <- splitContGRdiff(longi.dat, responses="Area.smooth",
                             INDICES = "Snapshot.ID.Tag", which.rates=c("AGR", "RGR"))
Uses fitSpline to fit a spline to a subset of the values of response and stores the fitted values in data. The subsets are those values with the same combinations of the levels of the factors listed in INDICES and the degree of smoothing is controlled by df. The derivatives of the fitted spline can also be obtained, as can the Relative Growth Rates (RGR).
By default, smooth.spline will issue an error if there are not at least four distinct x-values. On the other hand, fitSpline issues a warning and sets all smoothed values and derivatives to NA. The handling of missing values in the observations is controlled via na.x.action and na.y.action.
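A minimal base R sketch of the split, smooth and recombine pattern is given below, using stats::smooth.spline directly. It is not the package's implementation: the simulated data and the helper smooth_one are hypothetical, there is no handling of missing values or of individuals with fewer than four distinct x-values, and no boundary correction. The RGR is taken as the first derivative divided by the fitted value.

```r
set.seed(7)
longi <- data.frame(
  Snapshot.ID.Tag = rep(c("A", "B"), each = 10),
  xDays = rep(1:10, times = 2)
)
longi$Area <- 10 * exp(0.15 * longi$xDays) + rnorm(20, sd = 0.3)  # invented data

# Hypothetical helper: smooth one individual's values
smooth_one <- function(d, df = 4) {
  fit <- smooth.spline(d$xDays, d$Area, df = df)
  d$Area.smooth <- predict(fit, d$xDays)$y
  d$Area.smooth.AGRdv <- predict(fit, d$xDays, deriv = 1)$y   # first derivative
  d$Area.smooth.RGRdv <- d$Area.smooth.AGRdv / d$Area.smooth  # derivative / fitted
  d
}

longi <- do.call(rbind, lapply(split(longi, longi$Snapshot.ID.Tag), smooth_one))
rownames(longi) <- NULL
head(longi)
```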
splitSplines(data, response, x, INDICES, df = NULL, smoothing.scale = "identity", correctBoundaries = FALSE, deriv = NULL, suffices.deriv=NULL, RGR=NULL, AGR=NULL, sep=".", na.x.action="exclude", na.y.action = "exclude", ...)
data |
A |
response |
A |
x |
A |
INDICES |
A |
df |
A |
smoothing.scale |
A |
correctBoundaries |
A |
deriv |
A |
suffices.deriv |
A |
RGR |
A |
AGR |
A |
sep |
A |
na.x.action |
A |
na.y.action |
A |
... |
allows for arguments to be passed to |
A data.frame containing data to which has been added a column with the fitted smooth, the name of the column being response with .smooth appended to it. If deriv is not NULL, columns containing the values of the derivative(s) will be added to data; the name of each of these columns will be the value of response with .smooth.dvf appended, where f is the order of the derivative, or the value of response with .smooth. and the corresponding element of suffices.deriv appended. If RGR is not NULL, the RGR is calculated as the ratio of the value of the first derivative of the fitted spline and the fitted value for the spline. Any pre-existing smoothed and derivative columns in data will be replaced.
The ordering of the data.frame for the x values will be preserved as far as is possible; the main difficulty is with the handling of missing values by the function merge. Thus, if missing values in x are retained, they will occur at the bottom of each subset of INDICES and the order will be problematic when there are missing values in y and na.y.action is set to omit.
Chris Brien
Huang, C. (2001). Boundary corrected cubic smoothing splines. Journal of Statistical Computation and Simulation, 70, 107-121.
fitSpline, smooth.spline, predict.smooth.spline, splitContGRdiff, split
data(exampleData)
longi.dat <- splitSplines(longi.dat, response="Area", x="xDays",
                          INDICES = "Snapshot.ID.Tag",
                          df = 4, deriv=1, suffices.deriv="AGRdv", RGR="RGRdv")
Splits the values of a response into subsets corresponding to individuals and applies a function that calculates a single value to each individual's observations. It includes the ability to calculate the observation that corresponds to the calculated value of the function.
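As a rough illustration of this split-and-summarise pattern, the base R sketch below finds the maximum of a response for each individual, together with the observation at which it occurs. The toy data and the names of the result columns are hypothetical and are not necessarily those produced by the package.

```r
longi <- data.frame(
  Snapshot.ID.Tag = rep(c("A", "B"), each = 3),
  Days = rep(c(10, 12, 14), times = 2),
  Area.smooth = c(10, 25, 20, 5, 8, 12)   # invented values
)

by_ind <- split(longi, longi$Snapshot.ID.Tag)
res <- data.frame(
  Snapshot.ID.Tag = names(by_ind),
  # single value per individual
  Area.smooth.max = sapply(by_ind, function(d) max(d$Area.smooth, na.rm = TRUE)),
  # index of the observation giving that value, and its Days
  Area.smooth.max.obs = sapply(by_ind, function(d) which.max(d$Area.smooth)),
  Area.smooth.max.Days = sapply(by_ind, function(d) d$Days[which.max(d$Area.smooth)]),
  row.names = NULL
)
res
```

The result has one row per individual, as described for the value returned by the function.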
splitValueCalculate(response, weights=NULL, individuals = "Snapshot.ID.Tag", FUN = "max", which.obs = FALSE, which.levels = NULL, data, na.rm=TRUE, sep=".", ...)
response |
A |
weights |
A |
individuals |
A |
FUN |
A |
which.obs |
A |
which.levels |
A |
data |
A |
na.rm |
A |
sep |
A |
... |
allows for arguments to be passed to |
A data.frame, with the same number of rows as there are individuals, containing the values of the function for the individuals.
Chris Brien
data(exampleData)
# Maximum of the smoothed Area for each individual
Area.smooth.max <- splitValueCalculate("Area.smooth", data = longi.dat)
Takes pairs of values for a set of responses indexed by a two-level treatment.factor and calculates, for each pair, the result of applying a binary operation to their values for the two levels of the treatment.factor. The level of the treatment.factor designated the control will be on the right of the binary operator and the value for the other level will be on the left.
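The pairing and the order of the operands can be sketched in base R with merge. This is an illustration under simplifying assumptions, not the package's code: the toy data are invented, pairs are matched on Genotype.ID alone rather than on design columns such as Smarthouse, Zones and ZMainplots, and only the "/" operation is shown.

```r
d <- data.frame(
  Genotype.ID = rep(c("G1", "G2"), each = 2),
  Treatment.1 = rep(c("Control", "Salt"), times = 2),
  Area.smooth.AGR = c(4, 2, 5, 4)   # invented values
)

ctl <- d[d$Treatment.1 == "Control", c("Genotype.ID", "Area.smooth.AGR")]
trt <- d[d$Treatment.1 == "Salt", c("Genotype.ID", "Area.smooth.AGR")]

# One row per pair, with suffixed columns for the treated and control values
pairs <- merge(trt, ctl, by = "Genotype.ID", suffixes = c(".Salt", ".Cont"))

# Treated value on the left of the operator, control on the right
pairs$Area.smooth.AGR.OST <- pairs$Area.smooth.AGR.Salt / pairs$Area.smooth.AGR.Cont
pairs
```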
twoLevelOpcreate(responses, data, treatment.factor = "Treatment.1", suffices.treatment = c("Cont","Salt"), control = 1, columns.suffixed = NULL, operations = "/", suffices.results="OST", columns.retained = c("Snapshot.ID.Tag","Smarthouse","Lane", "Zones","xZones","SHZones","ZLane", "ZMainplots","xMainPosn", "Genotype.ID"), by = c("Smarthouse","Zones","ZMainplots"))
responses |
A |
data |
A |
treatment.factor |
A |
suffices.treatment |
A |
control |
A |
columns.suffixed |
A |
operations |
A |
suffices.results |
A |
columns.retained |
A |
by |
A |
A data.frame containing the following columns:
those from data nominated in columns.retained;
those containing the treated values of the columns whose names are specified in responses; the treated values are those having the other level of treatment.factor to that specified by control;
those containing the control values of the columns whose names are specified in responses; the control values are those having the level of treatment.factor specified by control;
those containing the values calculated using the binary operations; the names of these columns will be constructed from responses by appending suffices.results to them.
Chris Brien
data(exampleData)
responses <- c("Area.smooth.AGR","Area.smooth.RGR")
cols.retained <- c("Snapshot.ID.Tag","Smarthouse","Lane","Position",
                   "Days","Snapshot.Time.Stamp", "Hour", "xDays",
                   "Zones","xZones","SHZones","ZLane","ZMainplots",
                   "xMainPosn", "Genotype.ID")
longi.SIIT.dat <- twoLevelOpcreate(responses, longi.dat, suffices.treatment=c("C","S"),
                                   operations = c("-", "/"),
                                   suffices.results = c("diff", "SIIT"),
                                   columns.retained = cols.retained,
                                   by = c("Smarthouse","Zones","ZMainplots","Days"))
# Sort the result by the design columns and Days
longi.SIIT.dat <- with(longi.SIIT.dat,
                       longi.SIIT.dat[order(Smarthouse,Zones,ZMainplots,Days),])
Calculates the Water Use Index, returning NA if the water use is zero.
WUI(response, water)
response |
A |
water |
A |
A numeric containing the response divided by the water, unless water is zero in which case NA is returned.
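A minimal sketch of such a calculation follows. It assumes that the index is the response per unit of water, which is consistent with NA being returned when the water use is zero; the function name wui_sketch and the values are hypothetical, and this is not the package's source.

```r
# Hypothetical stand-in for the calculation: response per unit of water
wui_sketch <- function(response, water) {
  ifelse(water != 0, response / water, NA)
}

# Second value has zero water use, so its index is NA
wui_sketch(response = c(10, 6, 3), water = c(5, 0, 2))
```

Guarding against a zero denominator avoids Inf or NaN values appearing in later summaries and plots.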
Chris Brien
data(exampleData)
Area.WUE <- with(longi.dat, WUI(Area.AGR, Water.Loss))