Scheduled Jobs

The Scheduled Jobs feature in Domino allows you to run a script on a regular basis.

You can create reports in MATLAB using the publish() function or the more robust MATLAB Report Generator. To keep things simple, we will use the publish() function, which uses a MATLAB .m file as a document template.

In our case, let’s imagine that we receive new data about Berlin’s weather every day. We want to send a scheduled email that visualizes this data, along with the forecasted number of hot days in the next 365 days.

To do that, we will create two files:

  • An .m file, based on our live script, that will:
    • Load Berlin weather data from a URL
    • Prepare the data
    • Generate predictions using the model we create with the data
    • Call the publish() function
  • An .m file that will be our report template, displaying:
    • Weather prediction plot
    • Number of predicted hot days



Step 7.1: Publish the Weather Prediction Report

  1. Start a new MATLAB session.
  2. Create a new file named predictWeatherReport.m.
  3. Add the following code to your file. This code initializes a struct to hold the results, defines a hot-day temperature threshold (in degrees Celsius), downloads the current Berlin weather data, and saves it to the data/ folder. As before, we then read the downloaded file into a table.
%% Initial setup
result = struct;
hotDayThreshold = 30;
%% Download data file
urlString = "https://www.ncei.noaa.gov/data/global-historical-climatology-network-daily/access/GME00121150.csv";
if ~isfolder("data")
  mkdir('data');
end
savedFileName = sprintf("%s%s%s", "data", filesep, "berlin.csv");
websave(savedFileName, urlString);

%% Read the downloaded file
opts = detectImportOptions(savedFileName);
opts.SelectedVariableNames = {'DATE', 'PRCP', 'TMIN', 'TMAX'};
opts = setvartype(opts, {'DATE','PRCP','TMIN','TMAX'},{'datetime','double', 'double', 'double'});
stationWeatherTbl = readtable(savedFileName, opts);
  4. Let’s prepare the data for use. As before, we keep only data from complete years after 1999, convert the temperatures from tenths of a degree to full degrees, and fill in any missing data. If there are fewer than 1000 rows of data, we stop processing.
%%
[stationWeatherTbl.year, stationWeatherTbl.month, stationWeatherTbl.day] = ymd(stationWeatherTbl.DATE);
% Keep only complete years after 1999 (logical indexing on the table rows)
stationWeatherTbl = stationWeatherTbl(stationWeatherTbl.year > 1999 & stationWeatherTbl.year < max(stationWeatherTbl.year), :);
stationWeatherTbl.TMAX = stationWeatherTbl.TMAX/10;
stationWeatherTbl.TMIN = stationWeatherTbl.TMIN/10;
stationWeatherTbl = fillmissing(stationWeatherTbl, 'linear');

%% check if there is enough data for prediction
dataRows = size(stationWeatherTbl, 1);
if dataRows < 1000
    disp('Not enough data for prediction');
    result.error = 'Not enough data for prediction';
    return;
end
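To see what the scaling and gap-filling in this step do, here is a minimal sketch on a tiny made-up table. The table name toyTbl and all of its values are illustrative assumptions, not station data.

```matlab
% Toy data (hypothetical values); TMAX is in tenths of a degree Celsius
toyTbl = table([215; NaN; 235], [5; NaN; 1], 'VariableNames', {'TMAX', 'PRCP'});

% Convert tenths of a degree to full degrees
toyTbl.TMAX = toyTbl.TMAX / 10;

% Linearly interpolate each missing entry from its neighbors
toyTbl = fillmissing(toyTbl, 'linear');

disp(toyTbl);
```

Each gap is replaced by a value on the straight line between its neighbors, so no NaN entries remain for the model to trip over.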
  5. Next, let’s load the model we created for Berlin from its .mat file. If no saved model exists yet, the code trains one and saves it. Then we’ll create the input table for the prediction, based on the updated data we read from the URL in step 3.
%% Check if we have a model for this weather station
weatherStationId = "GME00121150"; % station ID, taken from the data URL above
modelFileName = sprintf("%s%s%s%s", "models", filesep, ...
    weatherStationId, ".mat");

% make sure we have a folder for the models
if ~isfolder('models')
  mkdir('models')
end

if ~isfile(modelFileName)
  disp('Training model for weather station...')
  cv = cvpartition(stationWeatherTbl.year, 'Holdout', 0.3);
  dataTrain = stationWeatherTbl(cv.training, :);

  [weatherModel, validationRMSE] = trainRegressionModel(dataTrain);

  % display the validation error (RMSE) of the trained model
  doneMessage = sprintf("Done. Model RMSE: %.2f", validationRMSE);
  disp(doneMessage);
  save(modelFileName, 'weatherModel');
else
  load(modelFileName, 'weatherModel');
end
%% Create table for future date prediction
todayDate = datetime('today');
daysIntoFuture = 365;
endDate = todayDate + days(daysIntoFuture);
predictedMaxTemps = table('Size', [daysIntoFuture+1 7], 'VariableTypes', ...
    {'datetime', 'double', 'double', 'double', 'double', 'double', 'double'}, ...
    'VariableNames', stationWeatherTbl.Properties.VariableNames);
x=1;

for i=todayDate:endDate
  % pick a random historical precipitation and minimum temperature for this calendar day
  [y, m, d] = ymd(i);

  minTemps = stationWeatherTbl.TMIN(stationWeatherTbl.month == m & stationWeatherTbl.day == d);
  prcps = stationWeatherTbl.PRCP(stationWeatherTbl.month == m & stationWeatherTbl.day == d);

  historicalRowCount = numel(minTemps);
  randomRow = randi(historicalRowCount);
  curMinTemp = minTemps(randomRow);
  predictedMaxTemps.TMIN(x) = curMinTemp;
  randomRow = randi([1 historicalRowCount]);
  predictedMaxTemps.PRCP(x) = prcps(randomRow);
  predictedMaxTemps.DATE(x) = i;
  predictedMaxTemps.year(x) = y;
  predictedMaxTemps.month(x) = m;
  predictedMaxTemps.day(x) = d;
  predictedMaxTemps.TMAX(x) = 0;
  x = x+1;
end
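The loop above fills each future date with values drawn at random from the same calendar day in past years. The following minimal sketch isolates that sampling step on made-up data; histTbl and its values are hypothetical, and the fixed random seed is there only to make the sketch repeatable.

```matlab
% Hypothetical history: three years of values for 15 July, one for 16 July
histTbl = table([7; 7; 7; 7], [15; 15; 15; 16], [14.2; 16.8; 15.1; 13.0], ...
    'VariableNames', {'month', 'day', 'TMIN'});

% Logical mask selecting every historical row for the target calendar day
m = 7;
d = 15;
sameDay = histTbl.month == m & histTbl.day == d;
candidates = histTbl.TMIN(sameDay);

% Draw one of the candidates uniformly at random
rng(0);  % fixed seed, used here only for repeatability
sampledMinTemp = candidates(randi(numel(candidates)));
disp(sampledMinTemp);
```

Sampling a past value, rather than averaging, keeps the day-to-day variability of the historical record in the model inputs.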
  6. Run the model and store the prediction in the result struct.
%%
yFit = weatherModel.predictFcn(predictedMaxTemps);
predResult = table(predictedMaxTemps.DATE, yFit, 'VariableNames', {'Date', 'Predicted TMAX'});
result.predictedTemps = predResult;
hotWeatherDaysIdx = predResult(predResult.("Predicted TMAX") > hotDayThreshold, :);
result.hotDayCountPrediction = height(hotWeatherDaysIdx);
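As a quick check of the hot-day count, here is a small sketch that applies the same filter to a made-up prediction table. The dates and temperatures in toyPred are illustrative assumptions. Variable names containing spaces, as used in the tutorial, require a MATLAB release that supports arbitrary table variable names.

```matlab
% Hypothetical predictions for four consecutive days
toyPred = table(datetime(2024, 7, 1) + days(0:3)', [28.4; 31.2; 30.0; 33.7], ...
    'VariableNames', {'Date', 'Predicted TMAX'});

hotDayThreshold = 30;

% Keep only rows strictly above the threshold, then count them
hotRows = toyPred(toyPred.("Predicted TMAX") > hotDayThreshold, :);
hotDayCount = height(hotRows);
disp(hotDayCount);
```

Because the comparison is strict, a day predicted at exactly 30 degrees is not counted as hot.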
  7. The publish() function runs in isolation from the workspace, so we need to share the prediction result with our template through a .mat file. To give the data file a unique name for each run of this script, we will use the Domino environment variable DOMINO_RUN_NUMBER.
%% save data to file
dominoRunId = getenv('DOMINO_RUN_NUMBER');
outputFileName = sprintf('%s%s%s%s', 'results', filesep, 'predictData_', string(dominoRunId));
save(outputFileName, 'result');
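If the script is ever run where DOMINO_RUN_NUMBER is not set, getenv returns an empty value and every run would write to the same file. The sketch below shows one possible way to guard against that. The fallback label 'local' is a hypothetical choice, not part of the tutorial, and the template would need the same fallback to find the file.

```matlab
% Read the run number; getenv returns an empty char array if it is not set
dominoRunId = getenv('DOMINO_RUN_NUMBER');
if isempty(dominoRunId)
    dominoRunId = 'local';  % hypothetical fallback label for runs outside Domino
end

% fullfile inserts the platform's file separator between the parts
outputFileName = fullfile('results', ['predictData_' dominoRunId]);
disp(outputFileName);
```

Using fullfile instead of sprintf with filesep is a stylistic alternative that produces the same kind of path.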
  8. Finally, call the publish() function. The report will be published to a subfolder of the results/ folder that is named after the number of the current run.
%% Publish the report

% options for the report
pub_options.format = 'pdf';

% hide the report code
pub_options.showCode = false;
pub_options.outputDir = sprintf('%s%s%s', 'results', filesep, dominoRunId);
doc = publish('predictWeatherReportTemplate.m', pub_options);



Step 7.2: Create the Report Template

  1. Now let’s create the report template. Create a new file named predictWeatherReportTemplate.m and load the data. Use the following format for the filename when loading the data: results/predictData_<Domino Run Number>. We’ll store the data in a variable called result.
dominoRunId = getenv('DOMINO_RUN_NUMBER');
inputFileName = sprintf('%s%s%s%s', 'results', filesep, 'predictData_', string(dominoRunId));
load(inputFileName, 'result');
  2. Add a title to the template. Note that in a MATLAB template of this kind, comments are rendered as markup. For more information, see MATLAB’s documentation on markup comments for publishing.
%% Predicted Weather
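To illustrate how comments become markup, here is a small standalone sketch of a publishable script. It is separate from the template in this tutorial, and the section text and the greeting variable are made up for illustration.

```matlab
%% Predicted Weather
% This paragraph is rendered as body text in the published report.
% Text between asterisks, such as *hot days*, is rendered in bold.
%
% * First bullet item
% * Second bullet item

%% A second section
% Code below a section title runs, and its output appears in the report.
greeting = 'Hello from the report';
disp(greeting);
```

A line starting with two percent signs opens a new section with a title, and the comment lines that directly follow it become the section's text.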
  3. Finally, add the plot with the data we loaded and display the hot day predictions as output.
predResult = result.predictedTemps;

plot(predResult.Date, predResult.("Predicted TMAX"));
titleText = "Weather forecast for the next 365 days (\circC)";
title(titleText);
ylabel('Forecasted Daily High Temperature')

%%
countPredictionText = sprintf("%s%d%s", "There will be ", ...
    result.hotDayCountPrediction, " hot days in the next 365 days");

disp(countPredictionText);
  4. Save and commit your changes using the Save & Push All button in the Domino toolbar.
  5. Let’s test the predictWeatherReport.m file. Run the script by entering predictWeatherReport in the “Command Window”. A new folder and a new file will be created inside the results/ folder.

Image-7-2-1

Image-7-2-2

  6. Once again, save and commit your changes.
  7. Stop your MATLAB session and navigate to the “Scheduled Jobs” page in Domino.

Image-7-2-3

  8. Schedule a new job by clicking the + New Scheduled Job button.

Image-7-2-4

  9. Give the new scheduled job a name and select your hardware tier. Enter predictWeatherReport.m as the file to run for this job. Click Next.

Image-7-2-5

  10. Skip the “Spark Cluster” section, as we will not be attaching a Spark cluster.
  11. Schedule the job to run every weekday. Leave the “Run sequentially” option checked. Click Next.

Image-7-2-6

  12. Finally, enter the email addresses of the people you would like to notify when the job is complete. Click Create.

Image-7-2-7

You have now scheduled a job that will use your MATLAB code to generate a report. You can similarly run the job on an ad-hoc basis using the Domino “Jobs” function.

To discover more tips on how to customize the resulting email, refer to Domino’s documentation on Custom Notifications.