LIS 4370 R Programming


Showing posts from March, 2024

Module 11 Debugging and Defensive Programming

March 24, 2024

The code we have been given:

    tukey_multiple <- function(x) {
      outliers <- array(TRUE,dim=dim(x))
      for (j in 1:ncol(x)) {
        outliers[,j] <- outliers[,j] && tukey.outlier(x[,j])
      }
      outlier.vec <- vector(length=nrow(x))
      for (i in 1:nrow(x)) { outlier.vec[i] <- all(outliers[i,]) } return(outlier.vec)
    }

Running this in RStudio produces the error:

    Error: unexpected symbol in:
    " for (i in 1:nrow(x)) { outlier.vec[i] <- all(outliers[i,]) } return"

Looking at the code, we can see that one bug is the use of the && operator inside the first loop, where we should be using the & operator. The && operator is not vectorized: it expects length-one operands (older versions of R silently used only the first element of each vector, and R 4.3.0 and later raise an error), whereas the & operator performs element-wise logical operations. Along with this, the return statement is causing the parse error, as it sits on the same line as the closing brace of the second for loop, when it should...
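As a minimal sketch of the two fixes described above (swapping && for & and moving return onto its own line), the function could look like the following. The excerpt does not show tukey.outlier, so the version here is a hypothetical stand-in based on the common 1.5 * IQR rule, and the test matrix is made up for illustration.

```r
# Hypothetical stand-in for tukey.outlier(), which the excerpt does not define:
# flags values lying more than 1.5 * IQR beyond the quartiles.
tukey.outlier <- function(v) {
  q <- quantile(v, c(0.25, 0.75), na.rm = TRUE)
  fence <- 1.5 * (q[[2]] - q[[1]])
  v < q[[1]] - fence | v > q[[2]] + fence
}

tukey_multiple <- function(x) {
  outliers <- array(TRUE, dim = dim(x))
  for (j in seq_len(ncol(x))) {
    # `&` works element by element, so every row keeps its own flag
    outliers[, j] <- outliers[, j] & tukey.outlier(x[, j])
  }
  outlier.vec <- vector(length = nrow(x))
  for (i in seq_len(nrow(x))) {
    outlier.vec[i] <- all(outliers[i, ])
  }
  # return() now sits on its own line, after the loop has closed
  return(outlier.vec)
}

# Made-up matrix: only the last row is extreme in both columns
m <- cbind(c(1, 2, 3, 4, 5, 100), c(2, 3, 4, 5, 6, 200))
result <- tukey_multiple(m)
print(result)
```

A row is reported as an outlier only when it is flagged in every column, which is why the final all() check runs across each row.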

Module #9 Visualization in R

March 10, 2024

For this assignment, I chose the “BenderlyZwick” dataset that was on the website provided to us. Using the following R code, I created three visualizations: a histogram of “returns”, a scatter plot of “growth” vs “inflation”, and a box plot of “growth2”.

This is the histogram for “returns”. It represents the frequency distribution of the “returns” variable from the dataset. Each bar represents the count of “returns” values that fall within the range defined by the bin width. For example, a bar at 0 with a height of 3 means that there are 3 instances in the dataset where the “returns” value falls within the range [0, 1).

This is the scatter plot for “growth” vs “inflation”. Each point represents a row in the dataset, with its x-coordinate corresponding to the “growth” value and its y-coordinate corresponding to the “inflation” value. The scatter plot shows that the data points are more concentrated between 2.5 and 5.0 on the growth axis and between 0 and 6 on the infla...
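The plotting code itself is not reproduced in this excerpt, so the following is only a sketch of how the three plots could be drawn with base R graphics. The data frame uses made-up values under the same column names, not the actual BenderlyZwick data, and the bin edges are an assumption chosen to match the [0, 1) reading of the histogram.

```r
# Made-up values for illustration only; this is NOT the BenderlyZwick data
df <- data.frame(
  returns   = c(-2.5, 0.2, 0.4, 0.9, 1.5, 3.1, 4.8),
  growth    = c(2.6, 3.1, 4.0, 4.4, 2.9, 3.5, 4.9),
  inflation = c(1.2, 2.5, 3.3, 5.1, 0.8, 4.0, 2.2),
  growth2   = c(3.0, 3.4, 4.1, 4.6, 2.7, 3.8, 5.2)
)

# Histogram with bin width 1; right = FALSE makes each bin left-closed, [a, b)
h <- hist(df$returns, breaks = seq(-3, 5, by = 1), right = FALSE,
          main = "Histogram of returns", xlab = "returns")

# Height of the bar that starts at 0, i.e. the count of values in [0, 1)
zero_bin <- h$counts[h$breaks[-length(h$breaks)] == 0]
print(zero_bin)

# Scatter plot: one point per row, growth on x and inflation on y
plot(df$growth, df$inflation,
     main = "growth vs inflation", xlab = "growth", ylab = "inflation")

# Box plot of growth2
boxplot(df$growth2, main = "Box plot of growth2", ylab = "growth2")
```

Base R's hist() uses right-closed bins by default, so right = FALSE is needed if the bars are to be read as [0, 1), [1, 2), and so on.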

Module #8 Input/Output, String Manipulation and plyr Package

March 03, 2024

To begin, I wrote out my version of the code that was provided to us. While it follows the same step-by-step process, I felt it would be more efficient to simplify it as well. This produced two tables for us to analyze, "Student" and "StudentAverage". For this, I will be analyzing the final table, StudentAverage. Running the function "summary(i_students)" gives us a set of statistics to look at. We can also create a histogram using the information provided in table "StudentAverage" using the function "hist(i_students$Age, main = "Histogram of Ages", xlab = "Age")".
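To show roughly what these two calls produce, here is a small sketch. The excerpt does not show how i_students is built, so the data frame below is a hypothetical stand-in with made-up names and values; only the Age column is taken from the text.

```r
# Hypothetical stand-in for i_students; all names and values are made up
i_students <- data.frame(
  Name  = c("Lilian", "Michelle", "Nicholas", "Olivia"),
  Age   = c(19, 21, 22, 20),
  Grade = c(88, 92, 75, 81),
  stringsAsFactors = FALSE
)

# summary() on a data frame reports per-column statistics; for numeric
# columns these are the minimum, quartiles, median, mean and maximum
print(summary(i_students))

# The same statistics for the Age column alone
age_stats <- summary(i_students$Age)
print(age_stats)

# Histogram call quoted in the post
hist(i_students$Age, main = "Histogram of Ages", xlab = "Age")
```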