Posts

Module 11 Debugging and Defensive Programming

 The code we have been given: tukey_multiple <- function(x) {    outliers <- array(TRUE,dim=dim(x))    for (j in 1:ncol(x))     {     outliers[,j] <- outliers[,j] && tukey.outlier(x[,j])     } outlier.vec <- vector(length=nrow(x))     for (i in 1:nrow(x))     { outlier.vec[i] <- all(outliers[i,]) } return(outlier.vec) } Running this in RStudio produces the error: Error: unexpected symbol in: "  for (i in 1:nrow(x))   { outlier.vec[i] <- all(outliers[i,]) } return" Looking at the code, we can see two problems. The first bug is that the && operator is used inside the loop where the & operator is needed. The && operator is not vectorized: it expects single-element operands (older versions of R silently used only the first element, and R 4.3.0 and later raise an error for longer vectors), whereas the & operator performs element-wise logical operations. Along with this, the return statement is causing the parse error because it sits on the same line as the closing brace of the second for loop, when it should...
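A minimal sketch of the two fixes described in the Module 11 excerpt, for readers who want to run it. The post does not show tukey.outlier(), so the version below is a hypothetical stand-in that uses the usual 1.5 * IQR fences, and the matrix m is made-up illustrative data.

```r
# Hypothetical stand-in for tukey.outlier(), which the excerpt does not define:
# flags values lying more than 1.5 * IQR beyond the quartiles.
tukey.outlier <- function(v) {
  q <- unname(quantile(v, c(0.25, 0.75), na.rm = TRUE))
  fence <- 1.5 * (q[2] - q[1])
  v < q[1] - fence | v > q[2] + fence
}

tukey_multiple <- function(x) {
  outliers <- array(TRUE, dim = dim(x))
  for (j in seq_len(ncol(x))) {
    # & is element-wise, so every row keeps its own TRUE/FALSE flag
    outliers[, j] <- outliers[, j] & tukey.outlier(x[, j])
  }
  outlier.vec <- vector(length = nrow(x))
  for (i in seq_len(nrow(x))) {
    # a row counts as an outlier only if it is flagged in every column
    outlier.vec[i] <- all(outliers[i, ])
  }
  # return() sits on its own line, after the loop's closing brace
  return(outlier.vec)
}

# Illustrative data (not from the post): row 6 is extreme in both columns
m <- cbind(c(1, 2, 3, 4, 5, 100), c(10, 11, 12, 13, 14, 500))
result <- tukey_multiple(m)
print(result)
```

Only the last row is flagged here, because it is the only row that lies outside the fences in both columns.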

Module #9 Visualization in R

 For this assignment, I chose the “BenderlyZwick” dataset that was on the website provided to us. Using the following R code, I created three visualizations: a histogram of “returns”, a scatter plot of “growth” vs “inflation”, and a box plot of “growth2”.  This is the histogram for “returns”. It represents the frequency distribution of the “returns” variable from the dataset. Each bar represents the count of “returns” values that fall within the range defined by the bin width. For example, a bar at 0 returns with a height of 3 means that there are 3 instances in the dataset where the “returns” value falls within the range of [0, 1). This is the scatter plot for “growth” vs “inflation”. Each point represents a row in the dataset, with its x-coordinate corresponding to the “growth” value and its y-coordinate corresponding to the “inflation” value. The scatter plot shows that the data points are more concentrated between 2.5 and 5.0 on the growth axis and between 0 and 6 on the infla...
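The bin interpretation in the Module #9 excerpt can be checked without drawing anything. This is only a sketch: the vector x is made-up illustrative data, not the BenderlyZwick values, and it assumes base R's hist(), whose default bins are closed on the right, so right = FALSE is needed to get [0, 1) style intervals.

```r
# Illustrative values only (not taken from the BenderlyZwick dataset)
x <- c(-1.5, 0.2, 0.5, 0.9, 1.0, 2.3)

# Bins of width 1; right = FALSE makes each bin left-closed, e.g. [0, 1)
h <- hist(x, breaks = seq(-2, 3, by = 1), right = FALSE, plot = FALSE)

print(h$breaks)  # bin edges
print(h$counts)  # number of values per bin: 1 0 3 1 1
```

The third bin, [0, 1), has a count of 3, which is what a bar of height 3 at 0 would mean in the histogram.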

Module #8 Input/Output, String Manipulation and plyr Package

 To begin, I wrote out my version of the code that was provided to us. While it follows the same step-by-step process, I felt it would be more efficient to simplify it as well. This produced two tables for us to analyze, "Student" and "StudentAverage". For this, I will be analyzing the final table, StudentAverage. Running the function "summary(i_students)" gives us a set of statistics to look at. We can also create a histogram using the information provided in table "StudentAverage" using the function "hist(i_students$Age, main = "Histogram of Ages", xlab = "Age")".

Module #7 R Object: S3 vs. S4 Assignment

 To begin, we will use the provided mtcars dataset that is already packaged in R. Since the mtcars dataset is a data frame, generic functions that have a data frame method or a default method, such as print() and summary(), can be used with it.  Because it is a data frame, mtcars is also classified as an S3 object. However, an S4 object can be created that represents a car in the mtcars dataset. 1. How do you tell what OO system (S3 vs. S4) an object is associated with? You can use the isS4 function, which returns TRUE for S4 objects such as instances of a custom class that you defined using setClass. An object that has a class attribute (is.object returns TRUE) but is not S4, such as a “data.frame”, is an S3 object. The class function on its own only reports the class name. 2. How do you determine the base type (like integer or list) of an object? You can use the typeof function to determine the base type of an object. 3. What is a generic function? A generic function is a function that behaves differently based on the class of the input object. Examples of generic functions in ...
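The checks described in the Module #7 excerpt can be sketched as follows. The S4 class "Car" and its slot names are hypothetical choices for illustration; the post does not show the class it actually defined.

```r
library(methods)  # provides setClass(), new() and isS4()

# mtcars is a data frame: an S3 object built on top of a list
class(mtcars)      # "data.frame"
is.object(mtcars)  # TRUE: it carries a class attribute
isS4(mtcars)       # FALSE, so it is S3 rather than S4
typeof(mtcars)     # "list"

# Hypothetical S4 class for a single car; slot names are illustrative
setClass("Car", representation(name = "character", mpg = "numeric", cyl = "numeric"))

first_car <- new("Car",
                 name = rownames(mtcars)[1],
                 mpg  = mtcars$mpg[1],
                 cyl  = mtcars$cyl[1])

isS4(first_car)    # TRUE
typeof(first_car)  # "S4"
```

Slots of the S4 object are read with @, for example first_car@mpg, whereas columns of the S3 data frame are read with $.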

Module #6 Doing Math Part 2

 1. My R Code for question one is  Running this gives the following output: 2.  My R code for question two is This gives the following output: 3. My R code for question 3 is the following:  Which gives the output:

Module #5 Doing Math

The purpose of this assignment is to learn more about how matrices function by finding the inverse and determinant of two matrices, A and B. We begin this process by creating the two matrices. A uses the numbers 1 through 100, reshaped into a 10x10 matrix. B uses the numbers 1 through 1000, reshaped into a 10x100 matrix.  Calculating the inverse of a matrix is quite simple in R and can be done with the solve() function. However, not every matrix has an inverse, which is immediately apparent when attempting to use this function on matrix A.  Upon running this function in R we are met with an error indicating there is no inverse. This is supported by running another function, det(), which is used to find the determinant of A. A matrix cannot have an inverse if its determinant is zero.  Moving on to matrix B, we are met with another error when running the solve() function on it.  This error indicates that matrix B is not square ...
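A short sketch of the Module #5 steps, assuming the matrices were built with matrix() and R's default column-wise fill (the post's own code is not shown). tryCatch() is used here only so that the script keeps running and the error messages can be inspected.

```r
A <- matrix(1:100, nrow = 10)   # 10 x 10
B <- matrix(1:1000, nrow = 10)  # 10 x 100

# The determinant of A is zero up to floating-point rounding
det_A <- det(A)
print(det_A)

# solve() fails on A because it is singular; keep the message instead of stopping
inv_A <- tryCatch(solve(A), error = function(e) conditionMessage(e))
print(inv_A)

# solve() fails on B because only square matrices can be inverted
inv_B <- tryCatch(solve(B), error = function(e) conditionMessage(e))
print(inv_B)
```

A is singular because its columns are linearly dependent: each column is the previous one plus 10, so the matrix has rank 2.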

Module #4 Programming Structure in R

These are the two outputs created with my R code, the first being the boxplot and the second being a histogram.  Some of my observations from the boxplot: the median blood pressure for the high category appears to be higher than that for the low category; the interquartile range is larger for the high category, which suggests that there is more variability in blood pressure readings among those in that category; and there may be an association between higher blood pressures and higher ratings by MDs.  The histogram provides a visual representation of the overall distribution of blood pressure for all patients.  Github Repository: https://github.com/matthewluu2002/R-Programming.git