When you are developing your model, you can use Workspaces to quickly execute code, see outputs, and make iterative improvements.
You previously learned how to start a Workspace and explored basic options.
In this topic, you will use your workspace to load, explore, and transform data. After the data has been prepared, you will train a model.
-
Click the browser tab to return to the Workspace.
-
Click New > Live Script to create a MATLAB Live Script.
-
Go to Save > Save As… to save it as ber_hot_weather.mlx.
-
Copy and paste the following command:
opts = detectImportOptions('tegel.csv');
-
Click Run.
-
Copy and paste the following commands to load tegel.csv into MATLAB. Then, click Run. This loads the data using the readtable() function, giving it the import options object as a second argument.
opts.SelectedVariableNames = {'DATE', 'PRCP', 'TMIN', 'TMAX'};
opts = setvartype(opts, {'DATE','PRCP','TMIN','TMAX'}, {'datetime','double','double','double'});
berWeatherTbl = readtable("tegel.csv", opts);
head(berWeatherTbl)
These commands loaded the following columns and assigned data types to them.
-
DATE – date the temperature was read
-
PRCP – total precipitation for the day
-
TMIN – lowest temperature measured that day
-
TMAX – highest temperature measured that day
The result will look similar to the following:
-
-
Click Section Break to create a new section in your script.
Note: Your cursor must be at the end of the previous section. You might have to go to the Insert tab to find Section Break.
-
Copy and paste the following command to split each date into its year, month, and day numbers and store each one as a table variable. This makes the data easier to examine and group.
[berWeatherTbl.year, berWeatherTbl.month, berWeatherTbl.day] = ymd(berWeatherTbl.DATE);
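To see what ymd() returns before applying it to the whole table, here is a minimal sketch on three made-up dates. The variable names sampleDates, y, m, and d are illustrative only and are not part of the tutorial script.

```matlab
% Hypothetical dates for illustration (not taken from tegel.csv)
sampleDates = datetime(2019, 7, [29; 30; 31]);

% ymd returns the year, month, and day numbers as separate numeric arrays
[y, m, d] = ymd(sampleDates);

% Each row shows year, month, day for one date
disp([y, m, d])
```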
-
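Copy and paste the following command to limit the dataset to January 2000 through December 2019, inclusive, by removing rows outside this range. The upper bound is written as year < max(year), which drops the last year present in the file, so the 2019 cutoff holds only if the data runs into 2020. Working with a smaller table also speeds up data processing.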
berWeatherTbl = berWeatherTbl(berWeatherTbl.year > 1999 & berWeatherTbl.year < max(berWeatherTbl.year) , :);
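The effect of this row filter can be checked on a small stand-in table. The sketch below uses hypothetical years (one row per year) and the illustrative names toyTbl and keep.

```matlab
% Hypothetical years, one row per year for brevity
toyTbl = table((1998:2002)', 'VariableNames', {'year'});

% Same condition as in the tutorial: after 1999, and before the last year present
keep = toyTbl.year > 1999 & toyTbl.year < max(toyTbl.year);
toyTbl = toyTbl(keep, :);

% The years before 2000 and the final year (2002 here) are removed
disp(toyTbl.year')
```

Because the upper bound depends on max(year), the last year kept changes with the contents of the file.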
-
Copy and paste the following commands to divide all temperature data by 10, which converts the values to degrees Celsius. You might have noticed that the values in the TMAX and TMIN columns look odd. This is because NOAA records these temperatures in tenths of a degree Celsius.
berWeatherTbl.TMAX = berWeatherTbl.TMAX/10;
berWeatherTbl.TMIN = berWeatherTbl.TMIN/10;
-
Copy and paste the following command to fill in missing data with linearly interpolated values.
berWeatherTbl = fillmissing(berWeatherTbl, 'linear');
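As a rough illustration of what linear filling does, the sketch below applies the same call to a tiny table with one missing value. The table toyTbl and its values are made up for this example.

```matlab
% Hypothetical temperatures with one missing reading (NaN)
toyTbl = table([10; NaN; 14; 16], 'VariableNames', {'TMAX'});

% Replace the missing entry by interpolating between its neighbors
toyTbl = fillmissing(toyTbl, 'linear');

% The gap between 10 and 14 is filled with their midpoint
disp(toyTbl.TMAX')
```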
-
Copy and paste the following head() function to preview the start of the table.
head(berWeatherTbl)
-
Click Run. The result looks like the following:
Important: Click Save occasionally to save your script.
-
-
Click Section Break to create another section in your Live Script. In this section, you’ll calculate how many hot days have occurred in Berlin since the year 2000. To start, copy and paste the following command to set the baseline hot day threshold to 29 degrees Celsius.
hotDayThreshold = 29;
-
Copy and paste the following command to flag the hot days that have occurred since (and including) the year 2000. This command creates a logical table column that is true for each day whose maximum temperature (TMAX) meets or exceeds the hot day threshold.
berWeatherTbl.HotDayFlag = berWeatherTbl.TMAX >= hotDayThreshold;
-
Copy and paste the following command to use groupsummary() to count how many hot days were flagged:
numHotDaysPerYear = groupsummary(berWeatherTbl, 'year', 'sum', 'HotDayFlag');
-
Copy and paste the following command to repeat the same approach to find the highest temperature of each year:
maxTempOfYear = groupsummary(berWeatherTbl, 'year', 'max', 'TMAX');
-
Copy and paste the following command to combine the variables to create a table named annualMaxTbl:
annualMaxTbl = join(numHotDaysPerYear, maxTempOfYear);
annualMaxTbl.Properties.VariableNames = {'Year', 'daysInYear', 'hotDayCount', 'maxTemp'};
annualMaxTbl
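The rename to four names makes more sense once you see what groupsummary() and join() produce. The following sketch repeats the same steps on a small made-up table. The names toyTbl, hotPerYear, maxPerYear, and toyAnnual are hypothetical and are not part of the tutorial script.

```matlab
% Hypothetical daily maxima for two years (not from tegel.csv)
year = [2000; 2000; 2000; 2001; 2001];
TMAX = [28.5; 30.1; 29.0; 31.2; 25.0];
toyTbl = table(year, TMAX);
toyTbl.HotDayFlag = toyTbl.TMAX >= 29;

% Each summary has the group variable, a GroupCount column, and one result column
hotPerYear = groupsummary(toyTbl, 'year', 'sum', 'HotDayFlag');
maxPerYear = groupsummary(toyTbl, 'year', 'max', 'TMAX');

% join matches rows on the variables both tables share (year and GroupCount)
toyAnnual = join(hotPerYear, maxPerYear);

% Variable names before any renaming:
% year, GroupCount, sum_HotDayFlag, max_TMAX
disp(toyAnnual.Properties.VariableNames)
```

GroupCount is the number of rows in each year, which is why the tutorial renames it to daysInYear.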
-
Click Run Section. The table looks like the following:
-
-
Click Section Break to create another section in your Live Script. In this section, you’ll visualize the weather data using a chart that combines a bar graph and a line graph. The chart will use two y-axes.
The bar graph will represent the hot day count (for a given year), and the line graph will represent the highest annual temperature (in Celsius, for a given year). The y-axis on the left side of the chart will correspond to the hot day count, and the y-axis on the right side of the chart will correspond to the highest annual temperature.
-
Copy and paste the following to create a hot day count bar graph.
figure
hold on
yyaxis left
bar(annualMaxTbl.Year, annualMaxTbl.hotDayCount, 'FaceColor', 'b');
-
Copy and paste the following to add a title and labels to the x-axis and left side y-axis.
titleText = sprintf("%s%d%s%d%s%d", "Number of hot days (over ", hotDayThreshold,"
-