Step 6: Develop Your Model

When you are developing your model, you can use Workspaces to quickly execute code, see outputs, and make iterative improvements.

In Step 4, you learned how to start a Workspace and explored basic options.

In this topic, you will use your workspace to load, explore, and transform data. After the data has been prepared, you will train a model.

Step 6.1: Load and explore the dataset

  1. Click the browser tab to return to the Workspace.

  2. Click New > Live Script to create a MATLAB Live Script.

  3. Go to Save > Save As… to save the script as ber_hot_weather.mlx.

  4. Copy and paste the following command to detect the import options for tegel.csv:

    opts = detectImportOptions('tegel.csv');
  5. Click Run.

  6. Copy and paste the following commands to load tegel.csv into MATLAB. Then, click Run. The commands select four columns, set their data types, and load the data with the readtable() function, passing the import options object as the second argument.

    opts.SelectedVariableNames = {'DATE', 'PRCP', 'TMIN', 'TMAX'};
    opts = setvartype(opts, {'DATE','PRCP','TMIN','TMAX'},{'datetime','double', 'double', 'double'});
    berWeatherTbl = readtable("tegel.csv", opts);
    head(berWeatherTbl)

    These commands loaded the following columns and assigned data types to them.

    • DATE – date the temperature was read

    • PRCP – total precipitation for the day

    • TMIN – lowest temperature measured that day

    • TMAX – highest temperature measured that day

      The result will look similar to the following:

      matlab table

  7. Click Section Break to create a new section in your script.

    1. Copy and paste the following command to format the dates into a Year/Month/Day format and store each field as a table variable. This helps you examine the data.

      [berWeatherTbl.year, berWeatherTbl.month, berWeatherTbl.day] = ymd(berWeatherTbl.DATE);
    2. Copy and paste the following command to limit the dataset to temperatures between January 2000 and December 2019, inclusive. The command keeps the rows after 1999 and drops the final year in the file, max(berWeatherTbl.year). Working with fewer rows speeds up data processing.

      berWeatherTbl = berWeatherTbl(berWeatherTbl.year > 1999 & berWeatherTbl.year < max(berWeatherTbl.year) , :);
    3. Copy and paste the following commands to divide all temperature data by 10 to convert it to degrees Celsius. You might have noticed that the values in the TMAX and TMIN columns look a bit odd. This is because NOAA records temperatures in tenths of a degree Celsius.

      berWeatherTbl.TMAX = berWeatherTbl.TMAX/10;
      berWeatherTbl.TMIN = berWeatherTbl.TMIN/10;
    4. Copy and paste the following command to fill in missing values using linear interpolation.

      berWeatherTbl = fillmissing(berWeatherTbl, 'linear');
    5. Copy and paste the following head() function to preview the start of the table.

      head(berWeatherTbl)
    6. Click Run. The result looks like the following:

      matlab table 2

  8. Click Section Break to create another section in your Live Script. In this section, you’ll calculate how many hot days have occurred in Berlin since the year 2000. To start, copy and paste the following command, which sets the hot day threshold to 29 degrees Celsius.

    hotDayThreshold = 29;
  9. Copy and paste the following command to calculate how many hot days have occurred since (and including) the year 2000. This command creates a table column indexing the days with maximum temperatures (TMAX) that meet or exceed the hot day threshold.

    berWeatherTbl.HotDayFlag = berWeatherTbl.TMAX >= hotDayThreshold;
    1. Copy and paste the following command to use groupsummary() to count how many hot days were flagged:

      numHotDaysPerYear = groupsummary(berWeatherTbl, 'year', 'sum', 'HotDayFlag');
    2. Copy and paste the following command to repeat the same approach to find the highest temperature of each year:

      maxTempOfYear = groupsummary(berWeatherTbl, 'year', 'max', 'TMAX');
    3. Copy and paste the following command to combine the variables to create a table named annualMaxTbl:

      annualMaxTbl = join(numHotDaysPerYear, maxTempOfYear);
      annualMaxTbl.Properties.VariableNames = {'Year', 'daysInYear', 'hotDayCount', 'maxTemp'};
      annualMaxTbl
    4. Click Run Section. The table looks like the following:

      matlab table 3

  10. Click Section Break to create another section in your Live Script. In this section, you’ll visualize the weather data using a chart that combines a bar graph and a line graph. The chart will use two y-axes.

    The bar graph will represent the hot day count (for a given year), and the line graph will represent the highest annual temperature (in Celsius, for a given year). The y-axis on the left side of the chart will correspond to the hot day count, and the y-axis on the right side of the chart will correspond to the highest annual temperature.

    • Copy and paste the following to create a hot day count bar graph.

      figure
      hold on
      yyaxis left
      bar(annualMaxTbl.Year,  annualMaxTbl.hotDayCount, 'FaceColor', 'b');
    • Copy and paste the following to add a title and labels to the x-axis and left side y-axis.

      titleText = sprintf("%s%d%s%d%s%d", "Number of hot days (over ", hotDayThreshold,"\circC) - ", min(annualMaxTbl.Year), "-", max(annualMaxTbl.Year));
      title(titleText)
      ylabel("Hot days per year")
      xlabel("Year")
    • Copy and paste the following to draw the line plot for the highest temperature each year.

      yyaxis right
      ylabel("Highest Annual Temperature in \circC")
      
      plot(annualMaxTbl.Year, annualMaxTbl.maxTemp, 'Color', 'r', "Marker","*")
      hold off
    • Click Run Section. Your chart should look something like the following:

      matlab table 4
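
If you want to see what the cleaning and counting commands do without loading tegel.csv, you can try them on a few rows first. The following is a minimal sketch, assuming a hypothetical six-day sample: sampleTbl and its values are invented for illustration and do not come from the NOAA file. It applies the same scaling, interpolation, flagging, and grouping calls used in this step.

```matlab
% Hypothetical sample standing in for tegel.csv (TMAX in tenths of a degree Celsius)
DATE = datetime(2018, 7, 1) + caldays(0:5)';
TMAX = [285; NaN; 305; 310; 290; 301];
sampleTbl = table(DATE, TMAX);

sampleTbl.TMAX = sampleTbl.TMAX / 10;            % 285 becomes 28.5 degrees Celsius
sampleTbl = fillmissing(sampleTbl, 'linear');    % the NaN on 2 July becomes 29.5
sampleTbl.year = year(sampleTbl.DATE);
sampleTbl.HotDayFlag = sampleTbl.TMAX >= 29;     % same comparison as hotDayThreshold

% One row per year with the number of flagged days in sum_HotDayFlag
hotDays = groupsummary(sampleTbl, 'year', 'sum', 'HotDayFlag')
```

Because the comparison uses >=, the day that reads exactly 29.0 counts as a hot day.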

Step 6.2: MATLAB - Generate predictions from data

In this section, you will use an interactive machine-learning MATLAB application called Regression Learner to develop a model that can predict the weather for the coming year.

Step 6.2.1: Partition the data

You must partition the data that will be used with Regression Learner into the following sets:

  • Data to train the model

  • Data to test the model

    1. Click Section Break in the Live Script and copy and paste the following code to remove the HotDayFlag column.

      berWeatherTbl.HotDayFlag = [];
    2. Copy and paste the following to partition the data, holding out 30% of the rows for testing.

      cv = cvpartition(berWeatherTbl.year, 'Holdout', 0.3);
      dataTrain = berWeatherTbl(cv.training, :);
      dataTest = berWeatherTbl(cv.test, :);
    3. Click Run Section.
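
The partition is random, so the training and test rows change each time the section runs. The following is a minimal sketch of a repeatable holdout split, assuming a hypothetical vector of year labels (groups) and an arbitrary seed value of 42; neither is part of the tutorial. When cvpartition receives a vector as its first argument, as it does with berWeatherTbl.year above, it treats the values as groups and stratifies the split by them.

```matlab
rng(42);                                               % fix the seed so the split is repeatable
groups = [repmat(2018, 10, 1); repmat(2019, 10, 1)];   % hypothetical year labels
cv = cvpartition(groups, 'Holdout', 0.3);              % hold out about 30% for testing

trainIdx = training(cv);   % logical index of the training rows
testIdx = test(cv);        % logical index of the held-out rows
fprintf('%d training rows, %d test rows\n', sum(trainIdx), sum(testIdx));
```

Every row lands in exactly one of the two sets, so trainIdx and testIdx never overlap.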

Step 6.2.2: Train the model

  1. Click the APPS tab and click the Regression Learner app. If you do not see the Regression Learner app, click the arrow to expand the full app list.

    regression learner icon

    Important

    The application opens in a new window.

  2. Click New Session and select From Workspace.

  3. Use the New Session window to specify the input variables for predictions in your model, as well as the outputs (or responses) you want to predict. In this tutorial, the output is the maximum temperature.

    1. For the input variable, from the Workspace Variable list, select dataTrain.

    2. For the output, in the Response section, select TMAX (maximum temperature).

    3. Select PRCP, TMIN, year, month, and day in the Predictors section.

      matlab variables

    4. Click Start Session. The Regression Learner window refreshes and shows the original data set and the values of TMAX.

      original train

  4. Select the type of model to be used for model training. For this tutorial, select Coarse Tree.

    1. Regression Learner runs best on a container with multiple cores because it can run in parallel and produce models rapidly. If you are using a single-core container, click Use Parallel in Regression Learner to turn off parallel processing.

    2. Click Train to start the model training process. The Domino container starts a parallel pool, which speeds up model training.

  5. Select Fine Gaussian SVM to compare the results to Coarse Tree. You can select additional models or even select all models and compare the results to identify the best fit for your data.

    1. Click the arrow to access the model types.

      access model types

      model selection

    2. Click Train. The model list automatically selects the model that best fits the data. Several visualizations are shown to demonstrate this.

      best performing model

    3. Click Predicted vs. Actual Plot to open a chart that compares the model's predictions with the true values in the data. The closer the points are to the diagonal, the better the predictions.

      predicted vs actual

  6. Click Generate Function to use Regression Learner to create a function that will be used to deploy the model with Domino. MATLAB generates the function in an M-file. Click Save to save the file as trainRegressionModel.m.

    m file
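
For readers who prefer code to the app, the following is a rough command-line sketch of training and cross-validating a regression tree. It is not the function that Regression Learner generates. The data is synthetic (demoTbl is invented for illustration), and the minimum leaf size of 36 is an assumption about what the Coarse Tree preset uses, so check the Regression Learner documentation for your release.

```matlab
rng(0);                                   % repeatable synthetic data
n = 200;
TMIN = 10 + 5*randn(n, 1);                % hypothetical minimum temperatures
PRCP = abs(randn(n, 1));                  % hypothetical precipitation
TMAX = TMIN + 8 + randn(n, 1);            % response loosely tied to TMIN
demoTbl = table(PRCP, TMIN, TMAX);

% A larger MinLeafSize gives a coarser tree with fewer leaves (36 is assumed here)
treeMdl = fitrtree(demoTbl, 'TMAX', 'MinLeafSize', 36);

% 5-fold cross-validation; kfoldLoss returns the mean squared error by default
cvMdl = crossval(treeMdl, 'KFold', 5);
rmse = sqrt(kfoldLoss(cvMdl))
```

The cross-validated RMSE plays the same role as the RMSE that the app reports for each trained model.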

Step 6.2.3: Export the model

  1. Click Export Model to export the model to your Domino workspace so you can use it for predictions.


    export model button

  2. Type a name for the model, such as weatherModel, and click OK.

  3. Close the Regression Learner app. The trained model now appears in your workspace.

    matlab model available

    Notice that the Command Window shows information about how to use the model to make predictions with the following line of code:

    yFit  = weatherModel.predictFcn(T);

    If you pass in a table of data, this line of code returns the predictions as a numeric column, one value per row. The input table must contain the predictors that the model was trained on (PRCP, TMIN, year, month, and day), named as they are in berWeatherTbl. It does not need a TMAX column, because that is the value being predicted. The predicted TMAX values are returned in yFit.
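
Before calling the model, it can help to confirm that an input table has the expected columns. The following is a minimal sketch, assuming the five predictors selected in Step 6.2.2. The requiredVars list and the one-row table T are written out by hand for illustration, and the call to the exported model is left as a comment so that the block runs on its own.

```matlab
% Predictors selected in the New Session window (typed by hand for this sketch)
requiredVars = {'PRCP', 'TMIN', 'year', 'month', 'day'};

% Hypothetical one-row input table
T = table(0.0, 12.5, 2022, 7, 15, 'VariableNames', requiredVars);

% Stop early with a readable message if a predictor column is absent
missingVars = setdiff(requiredVars, T.Properties.VariableNames);
if ~isempty(missingVars)
    error('Input table is missing: %s', strjoin(missingVars, ', '));
end

% With the exported model in the workspace, the prediction call would be:
% yFit = weatherModel.predictFcn(T);
```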

Step 6.2.4: Test the model

  1. To test the model with the data you partitioned earlier, create a Section Break in your Live Script (ber_hot_weather.mlx). Copy and paste the following to use the model with the test data and the function call that was listed in the Command Window.

    yFit  = weatherModel.predictFcn(dataTest);
  2. Copy and paste the following to compare the results column to the actual values in the test data set.

    err = yFit - dataTest.TMAX;
  3. Copy and paste the following to draw a histogram to visualize the results.

    figure;
    histogram(err)
    xlim([-15 15])
    ylabel('Number of predictions');
    xlabel('Gap with actual test data')
  4. Click Run Section. The result looks like the following:

    step 5.2 17

  5. To save the working model to be used later, copy and paste the following in the Command Window and then press Enter.

    save weatherModel weatherModel

    step 5.2 18
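
The histogram shows the spread of the errors, and a single summary number can make it easier to compare models. The following is a minimal sketch of two common error measures, assuming short hypothetical vectors (actual and predicted are invented values). In the Live Script you would apply the last two lines to the err variable computed above.

```matlab
actual = [21.0; 25.5; 30.0; 18.5];       % hypothetical measured TMAX values
predicted = [22.0; 24.5; 28.0; 18.5];    % hypothetical model output
err = predicted - actual;                % same sign convention as yFit - dataTest.TMAX

rmse = sqrt(mean(err.^2));   % root mean squared error, penalizes large misses more
mae = mean(abs(err));        % mean absolute error, in degrees Celsius
fprintf('RMSE = %.3f, MAE = %.3f\n', rmse, mae);
```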

Step 6.2.5 Make predictions

You can use the model to predict the weather for next year. You’ll generate a table with next year’s dates and add randomly selected, historical precipitation and minimum temperature data to the table for those dates. This information helps the model make proper predictions.

  1. Create a new Section Break in your Live Script.

  2. Copy and paste the following to create a table with date and temperature input data.

    todayDate = datetime('today');
    daysIntoFuture = 365;
    endDate = todayDate + days(daysIntoFuture);
    predictedMaxTemps = table('Size', [daysIntoFuture+1 7], 'VariableTypes', {'datetime', 'double', 'double', 'double', 'double', 'double', 'double'}, 'VariableNames', berWeatherTbl.Properties.VariableNames);
    x=1;
  3. Copy and paste the following to loop through each date from today to endDate and populate the table.

    for i = todayDate:endDate
        [y, m, d] = ymd(i);
        % Historical values recorded on the same calendar day
        minTemps = berWeatherTbl.TMIN(berWeatherTbl.month == m & berWeatherTbl.day == d);
        prcps = berWeatherTbl.PRCP(berWeatherTbl.month == m & berWeatherTbl.day == d);
        historicalRowCount = numel(minTemps);
        % Pick a random historical minimum temperature and precipitation value
        randomRow = randi([1 historicalRowCount]);
        predictedMaxTemps.TMIN(x) = minTemps(randomRow);
        randomRow = randi([1 historicalRowCount]);
        predictedMaxTemps.PRCP(x) = prcps(randomRow);
        predictedMaxTemps.DATE(x) = i;
        predictedMaxTemps.year(x) = y;
        predictedMaxTemps.month(x) = m;
        predictedMaxTemps.day(x) = d;
        predictedMaxTemps.TMAX(x) = 0;   % placeholder; the model predicts this value
        x = x+1;
    end
    
    head(predictedMaxTemps)
  4. Click Run Section.

    The result is a preview of the input table, built from historical weather data, that you can use for weather predictions. The TMAX column holds a placeholder value of 0 for now. The model returns its predictions when you run the table through it in the next step.

    preview table

  5. To run the model, copy and paste the following into a new Section Break and run the Section.

    yFit = weatherModel.predictFcn(predictedMaxTemps);
    result = table(predictedMaxTemps.DATE, yFit, 'VariableNames', {'Date', 'Predicted TMAX'})

    The result is the model's weather prediction for each date.

    prediction

  6. Copy and paste the following code, and then click Run Section to plot the forecast:

    figure
    plot(result.Date, result.("Predicted TMAX"))
    titleText = sprintf("%s%d%s", "Weather forecast for the next ", daysIntoFuture, " days in Berlin, Germany (\circC)");
    title(titleText)
    ylabel('Forecasted Daily High Temperature')

    plot graph

  7. Copy and paste the following code, and then Run Section to predict how many hot days will happen during the next year.

    hotWeatherDaysIdx = result(result.("Predicted TMAX") > hotDayThreshold, :);
    height(hotWeatherDaysIdx)

    The result on January 24, 2022 was a prediction of 0 hot days between January 2022 and February 2022. The results will vary based on the dates, data, and model used.

  8. To export your model, in the Command Window, type the following to save it into a MAT file:

    save weatherModel weatherModel

    Anyone in your Domino project can load it later with the following command:

    load weatherModel.mat
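
The save command writes a file named after its first argument with a .mat extension, and file names are case-sensitive on Linux-based workspaces. The following is a minimal sketch of a save and load round trip, assuming a hypothetical stand-in variable (weatherModelDemo) and a temporary folder, so that it does not overwrite the model you saved above.

```matlab
% Stand-in for the exported model; the real variable is weatherModel
weatherModelDemo = struct('note', 'stand-in for the exported model');

matFile = fullfile(tempdir, 'weatherModelDemo.mat');
save(matFile, 'weatherModelDemo');   % function form of: save <file> <variable>
clear weatherModelDemo

% Calling load with an output returns a struct with one field per saved variable
loaded = load(matFile);
disp(fieldnames(loaded))

delete(matFile);                     % remove the temporary file
```

Loading into a struct, as shown here, avoids silently overwriting a variable of the same name that is already in the workspace.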
Copyright © 2022 Domino Data Lab. All rights reserved.