<img src="http://oproject.org/img/ROOT.png" height="30%" width="30%">
<img src="http://oproject.org/img/tmvalogo.png" height="30%" width="30%">

<hr style="border-top-width: 4px; border-top-color: #34609b;">

## Enable ipywidgets

To visualize decision trees and the DNN weight map, you must enable ipywidgets. Run the following cell, then refresh the page.

In [None]:
!jupyter nbextension enable --py widgetsnbextension


In [1]:
import ROOT
from ROOT import TFile, TMVA, TCut

Welcome to JupyROOT 6.09/01


# Enable JS visualization

To use the new interactive features in the notebook, we have to enable a module called JsMVA. This is done with the IPython magic %jsmva.

In [2]:
%jsmva on

# Declaration of Factory

Let's start with the classical form of the declaration. If you know how to use TMVA in C++, you can use the same form in Python. The first argument is a string called the job name. The second argument is an open output TFile; it is optional, and if present it is used to store the output histograms. The last argument (third, or second when no file is passed) is a string containing all the Factory settings, separated by the ':' character.

## C++ like declaration

In [3]:
outputFile = TFile( "TMVA.root", 'RECREATE' )
TMVA.Tools.Instance();

factory = TMVA.Factory( "TMVAClassification", outputFile #this is optional
                       ,"!V:Color:DrawProgressBar:Transformations=I;D;P;G,D:AnalysisType=Classification" )

The options string can contain the following options:
<table>
<tr><th>Option</th><th>Default</th><th>Predefined values</th><th>Description</th></tr>
<tr>
 <td>V</td>
 <td>False</td>
 <td>-</td>
 <td>Verbose flag</td>
</tr>
<tr>
 <td>Color</td>
 <td>True</td>
 <td>-</td>
 <td>Flag for colored output</td>
</tr>
<tr>
 <td>Transformations</td>
 <td>""</td>
 <td>-</td>
 <td>List of transformations to test. For example with "I;D;P;U;G" string identity, decorrelation, PCA, uniform and Gaussian transformations will be applied</td>
</tr>
<tr>
 <td>Silent</td>
 <td>False</td>
 <td>-</td>
 <td>Batch mode: boolean silent flag inhibiting
any output from TMVA after
the creation of the factory class object</td>
</tr>
<tr>
 <td>DrawProgressBar</td>
 <td>True</td>
 <td>-</td>
 <td>Draw progress bar to display training,
testing and evaluation schedule (default:
True)</td>
</tr>
<tr>
 <td>AnalysisType</td>
 <td>Auto</td>
 <td>Classification,
Regression,
Multiclass, Auto</td>
 <td>Set the analysis type</td>
</tr>
</table>
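
To make the structure of the options string concrete, here is a minimal sketch in plain Python that splits such a string into a dictionary. The helper name `parse_option_string` is hypothetical and is not part of TMVA or JsMVA; it only illustrates the conventions visible in the cell above (entries separated by ':', a leading '!' to switch a flag off, and `Key=Value` pairs).

```python
def parse_option_string(options):
    """Split a TMVA-style option string into a dict (illustrative helper only)."""
    parsed = {}
    for token in options.split(":"):
        token = token.strip()
        if not token:
            continue
        if "=" in token:
            # "Key=Value" entries keep their value as a plain string.
            key, value = token.split("=", 1)
            parsed[key] = value
        elif token.startswith("!"):
            # A leading "!" switches a boolean flag off.
            parsed[token[1:]] = False
        else:
            # A bare name switches a boolean flag on.
            parsed[token] = True
    return parsed


opts = parse_option_string(
    "!V:Color:DrawProgressBar:Transformations=I;D;P;G,D:AnalysisType=Classification"
)
print(opts)
```

Reading the string this way shows that `!V` is simply `V=False`, which is the form the keyword-based declaration uses.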

## Pythonic version

By enabling JsMVA we have new, more readable ways to do the declaration (this applies to all functions, not just to the constructor).

In [4]:
factory = TMVA.Factory("TMVAClassification", TargetFile=outputFile,
                       V=False, Color=True, DrawProgressBar=True, Transformations=["I", "D", "P", "G", "D"],
                       AnalysisType="Classification")

Arguments of the constructor:
<table>
<tr><th>Keyword</th><th>Can be used as positional argument</th><th>Default</th><th>Predefined values</th><th>Description</th></tr>
<tr>
 <td>JobName</td>
 <td>yes, 1.</td>
 <td>not optional</td>
 <td>-</td>
 <td>Name of job</td>
</tr>
<tr>
 <td>TargetFile</td>
 <td>yes, 2.</td>
 <td>if not passed histograms won't be saved</td>
 <td>-</td>
 <td>File to write control and performance histograms to</td>
</tr>
<tr>
 <td>V</td>
 <td>no</td>
 <td>False</td>
 <td>-</td>
 <td>Verbose flag</td>
</tr>
<tr>
 <td>Color</td>
  <td>no</td>

 <td>True</td>
 <td>-</td>
 <td>Flag for colored output</td>
</tr>
<tr>
 <td>Transformations</td>
  <td>no</td>

 <td>""</td>
 <td>-</td>
 <td>List of transformations to test. For example with "I;D;P;U;G" string identity, decorrelation, PCA, uniform and Gaussian transformations will be applied</td>
</tr>
<tr>
 <td>Silent</td>
  <td>no</td>

 <td>False</td>

 <td>-</td>
 <td>Batch mode: boolean silent flag inhibiting
any output from TMVA after
the creation of the factory class object</td>
</tr>
<tr>
 <td>DrawProgressBar</td>
  <td>no</td>

 <td>True</td>
 <td>-</td>
 <td>Draw progress bar to display training,
testing and evaluation schedule (default:
True)</td>
</tr>
<tr>
 <td>AnalysisType</td>
  <td>no</td>

 <td>Auto</td>
 <td>Classification,
Regression,
Multiclass, Auto</td>
 <td>Set the analysis type</td>
</tr>
</table>
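
The keyword form maps naturally onto the C++ options string. The sketch below shows one way such a translation could work; `build_option_string` is a hypothetical helper, and the rule of joining list entries with ';' is an assumption about how JsMVA treats lists, not something confirmed by this notebook. The chained transformation "G,D" is passed as a single list entry here so that the result reproduces the C++ string used earlier.

```python
def build_option_string(**options):
    """Turn keyword arguments into a ':'-separated option string (sketch)."""
    parts = []
    for key, value in options.items():
        if isinstance(value, bool):
            # Booleans become bare flags; False is written with a leading "!".
            parts.append(key if value else "!" + key)
        elif isinstance(value, (list, tuple)):
            # Assumption: list entries are joined with ";".
            parts.append(key + "=" + ";".join(str(item) for item in value))
        else:
            parts.append("{}={}".format(key, value))
    return ":".join(parts)


option_string = build_option_string(
    V=False,
    Color=True,
    DrawProgressBar=True,
    Transformations=["I", "D", "P", "G,D"],
    AnalysisType="Classification",
)
print(option_string)
```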

# Declaring the DataLoader, adding variables and setting up the dataset

First we declare a DataLoader and add the variables, passing the variable names used in the training and test trees of the input dataset. To add a variable to the DataLoader we use the AddVariable function. Its arguments:

1. A string containing the variable name. A definition can be added with ":=".

2. A string (a label for the variable; if omitted, the variable name is used) or a character (the data type of the variable).

3. If a label is given, the data type can still be passed as the third argument.

In [5]:
dataset = "tmva_class_example" #the dataset name
loader  = TMVA.DataLoader(dataset)

loader.AddVariable( "myvar1 := var1+var2", 'F' )
loader.AddVariable( "myvar2 := var1-var2", "Expression 2", 'F' )
loader.AddVariable( "var3",                "Variable 3", 'F' )
loader.AddVariable( "var4",                "Variable 4", 'F' )
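
The ":=" syntax can be read as "name := definition". As a rough illustration only, the sketch below splits such a string and evaluates the definition for one toy event. Both helper names are hypothetical, and evaluating with Python's `eval` is an assumption that only holds for definitions that also happen to be valid Python; TMVA evaluates these expressions itself on the input trees.

```python
def split_variable_definition(expression):
    """Return (name, definition) for a 'name := definition' string."""
    if ":=" in expression:
        name, definition = expression.split(":=", 1)
        return name.strip(), definition.strip()
    # Without ":=" the variable is simply a branch of the tree.
    name = expression.strip()
    return name, name


def evaluate(definition, event):
    """Evaluate a simple arithmetic definition for one event (toy example)."""
    # Builtins are disabled so that only the event's own values are visible.
    return eval(definition, {"__builtins__": {}}, dict(event))


event = {"var1": 1.5, "var2": 0.5, "var3": -2.0, "var4": 0.25}
name, definition = split_variable_definition("myvar1 := var1+var2")
value = evaluate(definition, event)
print(name, definition, value)
```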

It is also possible to define spectator variables. These are part of the input dataset but are not used in MVA training, testing, or evaluation; they can be used for correlation tests and similar checks.
Parameters:

1. String containing the definition of spectator variable.
2. Label for spectator variable.
3. Data type

In [6]:
loader.AddSpectator( "spec1:=var1*2",  "Spectator 1",  'F' )
loader.AddSpectator( "spec2:=var1*3",  "Spectator 2",  'F' )

After adding the variables we have to add the data to the DataLoader. We first check whether the dataset file exists locally; if it does not, we download it from CERN's server. Once we have the ROOT file, we open it and get the signal and background trees.

In [7]:
if ROOT.gSystem.AccessPathName( "tmva_class_example.root" ) != 0: 
    ROOT.gSystem.Exec( "wget https://root.cern.ch/files/tmva_class_example.root")
    
input = TFile.Open( "tmva_class_example.root" )

# Get the signal and background trees for training
signal      = input.Get( "TreeS" )
background  = input.Get( "TreeB" )
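
Note that `gSystem.AccessPathName` returns a true value when the path can not be accessed, which is why the cell above downloads the file when the result is non-zero. The same "download only if missing" pattern can be sketched in plain Python. The helper `ensure_local_file` and the stand-in `fake_fetch` are hypothetical names; the stand-in replaces a real download so that the example runs offline.

```python
import os
import tempfile
import urllib.request


def ensure_local_file(path, url, fetch=urllib.request.urlretrieve):
    """Download url to path unless path already exists.

    Returns True if a download was triggered, False otherwise.
    """
    if os.path.exists(path):
        return False
    fetch(url, path)
    return True


def fake_fetch(url, path):
    # Stand-in for a real download so the example needs no network access.
    with open(path, "w") as handle:
        handle.write("placeholder for " + url)


with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "tmva_class_example.root")
    url = "https://root.cern.ch/files/tmva_class_example.root"
    first = ensure_local_file(target, url, fetch=fake_fetch)   # file missing
    second = ensure_local_file(target, url, fetch=fake_fetch)  # file present

print(first, second)
```

Dropping the `fetch` argument makes the helper use `urllib.request.urlretrieve`, which avoids depending on an external `wget` binary.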

To pass the signal and background trees to the DataLoader we use the AddSignalTree and AddBackgroundTree functions, and we also set the corresponding DataLoader attributes.
Arguments of these functions:

1. Signal/Background tree
2. Global weight used in all events in the tree.

In [8]:
# Global event weights (see below for setting event-wise weights)
signalWeight     = 1.0
backgroundWeight = 1.0

loader.AddSignalTree(signal, signalWeight)
loader.AddBackgroundTree(background, backgroundWeight)

loader.fSignalWeight = signalWeight
loader.fBackgroundWeight = backgroundWeight
loader.fTreeS = signal
loader.fTreeB = background

DataSetInfo: Dataset: tmva_class_example, Added class "Signal"
Add Tree TreeS of type Signal with 6000 events
DataSetInfo: Dataset: tmva_class_example, Added class "Background"
Add Tree TreeB of type Background with 6000 events

With the DataLoader.PrepareTrainingAndTestTree function we apply cuts to the input events. In C++ this function also takes its options as a string (as we saw for the Factory constructor); with JsMVA they can be passed as keyword arguments, just as for the Factory constructor.

Arguments of PrepareTrainingAndTestTree:
<table>

<tr>
    <th>Keyword</th>
    <th>Can be used as positional argument</th>
    <th>Default</th>
    <th>Predefined values</th>
    <th>Description</th>
</tr>

<tr>
    <td>SigCut</td>
    <td>yes, 1.</td>
    <td>-</td>
    <td>-</td>
    <td>TCut object for signal cut</td>
</tr>
<tr>
    <td>BkgCut</td>
    <td>yes, 2.</td>
    <td>-</td>
    <td>-</td>
    <td>TCut object for background cut</td>
</tr>

<tr>
    <td>SplitMode</td>
    <td>no</td>
    <td>Random</td>
    <td>Random,
Alternate,
Block</td>
    <td>Method of picking training and testing
events</td>
</tr>
<tr>
    <td>MixMode</td>
    <td>no</td>
    <td>SameAsSplitMode</td>
    <td>SameAsSplitMode,
Random,
Alternate,
Block</td>
    <td>Method of mixing events of different
classes into one dataset</td>
</tr>
<tr>
    <td>SplitSeed</td>
    <td>no</td>
    <td>100</td>
    <td>-</td>
    <td>Seed for random event shuffling</td>
</tr>
<tr>
    <td>NormMode</td>
    <td>no</td>
    <td>EqualNumEvents</td>
    <td>None, NumEvents,
EqualNumEvents</td>
    <td>Overall renormalisation of event-by-event
weights used in the training (NumEvents:
average weight of 1 per
event, independently for signal and
background; EqualNumEvents: average
weight of 1 per event for signal,
and sum of weights for background
equal to sum of weights for signal)</td>
</tr>

<tr>
    <td>nTrain_Signal</td>
    <td>no</td>
    <td>0 (all)</td>
    <td>-</td>
    <td>Number of training events of class Signal</td>
</tr>

<tr>
    <td>nTest_Signal</td>
    <td>no</td>
    <td>0 (all)</td>
    <td>-</td>
    <td>Number of test events of class Signal</td>
</tr>

<tr>
    <td>nTrain_Background</td>
    <td>no</td>
    <td>0 (all)</td>
    <td>-</td>
    <td>Number of training events of class
Background</td>
</tr>

<tr>
    <td>nTest_Background </td>
    <td>no</td>
    <td>0 (all)</td>
    <td>-</td>
    <td>Number of test events of class Background</td>
</tr>
<tr>
    <td>V</td>
    <td>no</td>
    <td>False</td>
    <td>-</td>
    <td>Verbosity</td>
</tr>
<tr>
    <td>VerboseLevel</td>
    <td>no</td>
    <td>Info</td>
    <td>Debug, Verbose,
Info</td>
    <td>Verbosity level</td>
</tr>

</table>
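
The three SplitMode values can be pictured with a toy example. The sketch below is an illustration of the general idea only: `split_events` is a hypothetical helper, it always splits the events into two halves, and the exact ordering TMVA uses internally is not shown in this notebook. The seed default of 100 mirrors the SplitSeed default in the table above.

```python
import random


def split_events(events, mode="Random", seed=100):
    """Split events into (training, testing) halves according to mode (sketch)."""
    events = list(events)
    half = len(events) // 2
    if mode == "Random":
        # Shuffle reproducibly, then cut the shuffled list in two.
        random.Random(seed).shuffle(events)
        return events[:half], events[half:]
    if mode == "Block":
        # First block for training, second block for testing.
        return events[:half], events[half:]
    if mode == "Alternate":
        # Events are assigned to training and testing in turn.
        return events[0::2], events[1::2]
    raise ValueError("unknown split mode: " + mode)


block = split_events(range(6), "Block")
alternate = split_events(range(6), "Alternate")
random_a = split_events(range(6), "Random", seed=100)
random_b = split_events(range(6), "Random", seed=100)
print(block, alternate, random_a)
```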

In [9]:
mycuts = TCut("")
mycutb = TCut("")

loader.PrepareTrainingAndTestTree(SigCut=mycuts, BkgCut=mycutb,
                    nTrain_Signal=0, nTrain_Background=0, SplitMode="Random", NormMode="NumEvents", V=False)
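
The NormMode description in the table can be turned into a small worked calculation. The sketch below computes the overall scale factor applied to the signal and background weights for each mode, following the wording of the table; `renormalisation_factors` is a hypothetical helper and is not how TMVA exposes this.

```python
def renormalisation_factors(signal_weights, background_weights, mode="EqualNumEvents"):
    """Return (signal_scale, background_scale) for a given NormMode (sketch)."""
    n_sig = len(signal_weights)
    n_bkg = len(background_weights)
    sum_sig = float(sum(signal_weights))
    sum_bkg = float(sum(background_weights))
    if mode == "None":
        return 1.0, 1.0
    if mode == "NumEvents":
        # Average weight of 1 per event, independently for each class.
        return n_sig / sum_sig, n_bkg / sum_bkg
    if mode == "EqualNumEvents":
        # Signal: average weight of 1 per event.
        # Background: sum of weights equal to the signal sum of weights.
        return n_sig / sum_sig, n_sig / sum_bkg
    raise ValueError("unknown NormMode: " + mode)


signal_weights = [2.0, 2.0, 2.0, 2.0]  # 4 events, sum of weights 8
background_weights = [1.0, 1.0]        # 2 events, sum of weights 2

num_events = renormalisation_factors(signal_weights, background_weights, "NumEvents")
equal_num = renormalisation_factors(signal_weights, background_weights, "EqualNumEvents")
print(num_events, equal_num)
```

With these toy weights, EqualNumEvents scales the background by 2 so that both classes end up with a sum of weights of 4.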

# Visualizing input variables

In [10]:
loader.DrawInputVariable("myvar1")

DataSetFactory: Dataset: tmva_class_example, Number of events in input trees

Number of training and testing events:
<table>
<tr><th>Class</th><th>Sample</th><th>Events</th></tr>
<tr><td>Signal</td><td>training events</td><td>3000</td></tr>
<tr><td>Signal</td><td>testing events</td><td>3000</td></tr>
<tr><td>Signal</td><td>training and testing events</td><td>6000</td></tr>
<tr><td>Background</td><td>training events</td><td>3000</td></tr>
<tr><td>Background</td><td>testing events</td><td>3000</td></tr>
<tr><td>Background</td><td>training and testing events</td><td>6000</td></tr>
</table>


## We can also visualize transformations on input variables

In [11]:
loader.DrawInputVariable("myvar1", processTrfs=["D", "N"]) #Transformations: I;N;D;P;U;G,D

DataLoader: Dataset: tmva_class_example, Create Transformation "D" with events from all classes.
DataLoader: Dataset: tmva_class_example, Transformation, Variable selection:
Input : variable 'myvar1' <---> Output : variable 'myvar1'
Input : variable 'myvar2' <---> Output : variable 'myvar2'
Input : variable 'var3' <---> Output : variable 'var3'
Input : variable 'var4' <---> Output : variable 'var4'
DataLoader: Dataset: tmva_class_example, Create Transformation "N" with events from all classes.
DataLoader: Dataset: tmva_class_example, Transformation, Variable selection:

<table>
<tr><th>Variable</th><th>Mean</th><th>RMS</th><th>Min</th><th>Max</th></tr>
<tr><td>myvar1</td><td>-0.11202</td><td>1.0000</td><td>-3.8813</td><td>3.3150</td></tr>
<tr><td>myvar2</td><td>-0.017404</td><td>1.0000</td><td>-3.7240</td><td>3.6440</td></tr>
<tr><td>var3</td><td>-0.11241</td><td>1.0000</td><td>-3.7248</td><td>3.8805</td></tr>
<tr><td>var4</td><td>0.32261</td><td>1.0000</td><td>-3.3662</td><td>3.1355</td></tr>
</table>

<table>
<tr><th>Variable</th><th>Mean</th><th>RMS</th><th>Min</th><th>Max</th></tr>
<tr><td>myvar1</td><td>0.047564</td><td>0.27792</td><td>-1.0000</td><td>1.0000</td></tr>
<tr><td>myvar2</td><td>0.0061262</td><td>0.27145</td><td>-1.0000</td><td>1.0000</td></tr>
<tr><td>var3</td><td>-0.050040</td><td>0.26298</td><td>-1.0000</td><td>1.0000</td></tr>
<tr><td>var4</td><td>0.13472</td><td>0.30761</td><td>-1.0000</td><td>1.0000</td></tr>
</table>


# Correlation matrix of input variables

In [12]:
loader.DrawCorrelationMatrix("Signal")
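
For readers who want to see what a correlation matrix contains, the sketch below computes linear (Pearson) correlation coefficients between toy columns in plain Python. That the JsMVA plot shows exactly these coefficients is an assumption; the helper names and the toy data are made up for illustration.

```python
import math


def pearson(x, y):
    """Linear correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


def correlation_matrix(columns):
    """Return a nested dict with the correlation of every pair of columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


toy = {
    "myvar1": [1.0, 2.0, 3.0, 4.0],
    "myvar2": [2.0, 4.0, 6.0, 8.0],  # perfectly correlated with myvar1
    "var3": [4.0, 3.0, 2.0, 1.0],    # perfectly anti-correlated with myvar1
}
matrix = correlation_matrix(toy)
print(matrix["myvar1"])
```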

# Booking methods

To choose the methods we want to train on the dataset, we use the Factory.BookMethod function. It adds a method and its options to the Factory.

Arguments:
<table>

<tr>
    <th>Keyword</th>
    <th>Can be used as positional argument</th>
    <th>Default</th>
    <th>Predefined values</th>
    <th>Description</th>
</tr>

<tr>
    <td>DataLoader</td>
    <td>yes, 1.</td>
    <td>-</td>
    <td>-</td>
    <td>Pointer to DataLoader object</td>
</tr>

<tr>
    <td>Method</td>
    <td>yes, 2.</td>
    <td>-</td>
    <td>     kVariable       ,
         kCuts           ,
         kLikelihood     ,
         kPDERS          ,
         kHMatrix        ,
         kFisher         ,
         kKNN            ,
         kCFMlpANN       ,
         kTMlpANN        ,
         kBDT            ,
         kDT             ,
         kRuleFit        ,
         kSVM            ,
         kMLP            ,
         kBayesClassifier,
         kFDA            ,
         kBoost          ,
         kPDEFoam        ,
         kLD             ,
         kPlugins        ,
         kCategory       ,
         kDNN            ,
         kPyRandomForest ,
         kPyAdaBoost     ,
         kPyGTB          ,
         kC50            ,
         kRSNNS          ,
         kRSVM           ,
         kRXGB           ,
         kMaxMethod</td>
    <td>Selected method number, method numbers defined in TMVA.Types</td>
</tr>
<tr>
    <td>MethodTitle</td>
    <td>yes, 3.</td>
    <td>-</td>
    <td>-</td>
    <td>Label for method</td>
</tr>
<tr>
    <td> * </td>
    <td> no </td>
    <td>-</td>
    <td>-</td>
    <td> Other named arguments which are the options for selected method. </td>
</tr>
</table>

In [13]:
factory.BookMethod( DataLoader=loader, Method=TMVA.Types.kSVM, MethodTitle="SVM", 
                Gamma=0.25, Tol=0.001, VarTransform="Norm" )

factory.BookMethod( loader,TMVA.Types.kMLP, "MLP", 
        H=False, V=False, NeuronType="tanh", VarTransform="N", NCycles=600, HiddenLayers="N+5",
                   TestRate=5, UseRegulator=False )

factory.BookMethod( loader,TMVA.Types.kLD, "LD", 
        H=False, V=False, VarTransform="None", CreateMVAPdfs=True, PDFInterpolMVAPdf="Spline2",
                   NbinsMVAPdf=50, NsmoothMVAPdf=10 )

factory.BookMethod( loader,TMVA.Types.kLikelihood,"Likelihood","NSmoothSig[0]=20:NSmoothBkg[0]=20:NSmoothBkg[1]=10",
    NSmooth=1, NAvEvtPerBin=50, H=True, V=False,TransformOutput=True,PDFInterpol="Spline2")

factory.BookMethod( loader, TMVA.Types.kBDT, "BDT",
    H=False, V=False, NTrees=850, MinNodeSize="2.5%", MaxDepth=3, BoostType="AdaBoost", AdaBoostBeta=0.5,
                   UseBaggedBoost=True, BaggedSampleFraction=0.5, SeparationType="GiniIndex", nCuts=20 )

<ROOT.TMVA::MethodBDT object ("BDT") at 0x6811880>

Factory: Booking method: SVM
SVM: Dataset: tmva_class_example, Create Transformation "Norm" with events from all classes.
SVM: Dataset: tmva_class_example, Transformation, Variable selection:
Input : variable 'myvar1' <---> Output : variable 'myvar1'
Input : variable 'myvar2' <---> Output : variable 'myvar2'
Input : variable 'var3' <---> Output : variable 'var3'
Input : variable 'var4' <---> Output : variable 'var4'
Factory: Booking method: MLP
Dataset: tmva_class_example, Create Transformation "N" with events from all classes.


## Booking DNN: 2 ways (don't use both at the same time)

There are two ways to book a DNN:

1) The visual way: run the next cell, and design the network graphically and then click on "Save Network"

In [14]:
factory.BookDNN(loader)

2) Classical way

In [15]:
trainingStrategy = [{
        "LearningRate": 1e-1,
        "Momentum": 0.0,
        "Repetitions": 1,
        "ConvergenceSteps": 300,
        "BatchSize": 20,
        "TestRepetitions": 15,
        "WeightDecay": 0.001,
        "Regularization": "NONE",
        "DropConfig": "0.0+0.5+0.5+0.5",
        "DropRepetitions": 1,
        "Multithreading": True
        
    },  {
        "LearningRate": 1e-2,
        "Momentum": 0.5,
        "Repetitions": 1,
        "ConvergenceSteps": 300,
        "BatchSize": 30,
        "TestRepetitions": 7,
        "WeightDecay": 0.001,
        "Regularization": "L2",
        "DropConfig": "0.0+0.1+0.1+0.1",
        "DropRepetitions": 1,
        "Multithreading": True
        
    }, {
        "LearningRate": 1e-2,
        "Momentum": 0.3,
        "Repetitions": 1,
        "ConvergenceSteps": 300,
        "BatchSize": 40,
        "TestRepetitions": 7,
        "WeightDecay": 0.001,
        "Regularization": "L2",
        "Multithreading": True
        
    },{
        "LearningRate": 1e-3,
        "Momentum": 0.1,
        "Repetitions": 1,
        "ConvergenceSteps": 200,
        "BatchSize": 70,
        "TestRepetitions": 7,
        "WeightDecay": 0.001,
        "Regularization": "NONE",
        "Multithreading": True
        
}, {
        "LearningRate": 1e-3,
        "Momentum": 0.1,
        "Repetitions": 1,
        "ConvergenceSteps": 200,
        "BatchSize": 70,
        "TestRepetitions": 7,
        "WeightDecay": 0.001,
        "Regularization": "NONE",
        "Multithreading": True
        
}]

factory.BookMethod(DataLoader=loader, Method=TMVA.Types.kDNN, MethodTitle="DNN", 
                   H = False, V=False, VarTransform="Normalize", ErrorStrategy="CROSSENTROPY",
                   Layout=["TANH|100", "TANH|50", "TANH|10", "LINEAR"],
                   TrainingStrategy=trainingStrategy,Architecture="STANDARD")

<ROOT.TMVA::MethodDNN object ("DNN") at 0x6634270>

Factory: Booking method: DNN
DNN: Dataset: tmva_class_example, Create Transformation "Normalize" with events from all classes.
DNN: Dataset: tmva_class_example, Transformation, Variable selection:
Input : variable 'myvar1' <---> Output : variable 'myvar1'
Input : variable 'myvar2' <---> Output : variable 'myvar2'
Input : variable 'var3' <---> Output : variable 'var3'
Input : variable 'var4' <---> Output : variable 'var4'


# Train Methods

When you use the jsmva magic, the original C++ version of Factory::TrainAllMethods is replaced by a new training method that produces notebook-compatible output during training, so we can trace the process (progress bar, error plot). For some methods (MLP, DNN, BDT) a tracer plot is created: test and training error versus epoch for MLP and DNN, and error fraction and boost weight versus tree number for BDT. Some methods do not support interactive tracing; for these only a short text is printed, so that we know which method TrainAllMethods is currently training.

For methods whose training can be traced interactively there is a stop button. It stops the training of the current method only; it does not stop TrainAllMethods as a whole.

In [16]:
factory.TrainAllMethods()

TFHandler_SVM:
<table>
<tr><th>Variable</th><th>Mean</th><th>RMS</th><th>Min</th><th>Max</th></tr>
<tr><td>myvar1</td><td>0.083989</td><td>0.36407</td><td>-1.0000</td><td>1.0000</td></tr>
<tr><td>myvar2</td><td>0.0094778</td><td>0.27696</td><td>-1.0000</td><td>1.0000</td></tr>
<tr><td>var3</td><td>0.080279</td><td>0.36720</td><td>-1.0000</td><td>1.0000</td></tr>
<tr><td>var4</td><td>0.12986</td><td>0.39603</td><td>-1.0000</td><td>1.0000</td></tr>
</table>

Building SVM Working Set...with 6000 event instances
Elapsed time for Working Set build : 1.24 sec
Sorry, no computing time forecast available for SVM, please wait ...
Elapsed time : 1.68 sec
Dataset: tmva_class_example, Evaluation of SVM on training sample (6000 events)
Dataset: tmva_class_example, Evaluation of MLP on training sample (6000 events)
Dataset: tmva_class_example, Evaluation of LD on training sample (6000 events)
Dataset: tmva_class_example, Separation from histogram (PDF): 0.452 (0.000)
Dataset: tmva_class_example, Evaluation of LD on training sample
Dataset: tmva_class_example, Evaluation of Likelihood on training sample (6000 events)
Dataset: tmva_class_example, Evaluation of BDT on training sample (6000 events)

<table>
<tr><th>Variable</th><th>Mean</th><th>RMS</th><th>Min</th><th>Max</th></tr>
<tr><td>myvar1</td><td>0.075113</td><td>0.36776</td><td>-1.1074</td><td>1.0251</td></tr>
<tr><td>myvar2</td><td>0.0075595</td><td>0.27349</td><td>-0.90663</td><td>1.0008</td></tr>
<tr><td>var3</td><td>0.070228</td><td>0.37106</td><td>-1.0649</td><td>1.0602</td></tr>
<tr><td>var4</td><td>0.12090</td><td>0.39854</td><td>-1.1871</td><td>1.0199</td></tr>
</table>

Dataset: tmva_class_example, Evaluation of DNN on training sample (6000 events)


# Test and evaluate the methods

To test the methods and evaluate their performance we need to run the Factory.TestAllMethods and Factory.EvaluateAllMethods functions.

In [17]:
factory.TestAllMethods()
factory.EvaluateAllMethods()

Factory: Test all methods
Factory: Test method: SVM for Classification performance
SVM: Dataset: tmva_class_example, Evaluation of SVM on testing sample (6000 events)
Elapsed time for evaluation of 6000 events : 0.983 sec
Factory: Test method: MLP for Classification performance
MLP: Dataset: tmva_class_example, Evaluation of MLP on testing sample (6000 events)
Dataset: tmva_class_example, Evaluation of LD on testing sample (6000 events)
Dataset: tmva_class_example, Evaluation of LD on testing sample
Dataset: tmva_class_example, Evaluation of Likelihood on testing sample (6000 events)
Dataset: tmva_class_example, Evaluation of BDT on testing sample (6000 events)
Dataset: tmva_class_example, Evaluation of DNN on testing sample (6000 events)
Dataset: tmva_class_example, Loop over test events and fill histograms with classifier response...

<table>
<tr><th>Variable</th><th>Mean</th><th>RMS</th><th>Min</th><th>Max</th></tr>
<tr><td>myvar1</td><td>0.075113</td><td>0.36776</td><td>-1.1074</td><td>1.0251</td></tr>
<tr><td>myvar2</td><td>0.0075595</td><td>0.27349</td><td>-0.90663</td><td>1.0008</td></tr>
<tr><td>var3</td><td>0.070228</td><td>0.37106</td><td>-1.0649</td><td>1.0602</td></tr>
<tr><td>var4</td><td>0.12090</td><td>0.39854</td><td>-1.1871</td><td>1.0199</td></tr>
</table>

<table>
<tr><th>Variable</th><th>Mean</th><th>RMS</th><th>Min</th><th>Max</th></tr>
<tr><td>myvar1</td><td>-0.010814</td><td>3.0633</td><td>-9.8605</td><td>7.9024</td></tr>
<tr><td>myvar2</td><td>0.00090552</td><td>1.1092</td><td>-3.7067</td><td>4.0291</td></tr>
<tr><td>var3</td><td>-0.015118</td><td>1.7459</td><td>-5.3563</td><td>4.6430</td></tr>
<tr><td>var4</td><td>0.14331</td><td>2.1667</td><td>-6.9675</td><td>5.0307</td></tr>
</table>


# Classifier Output Distributions

To draw the classifier output distribution we use the Factory.DrawOutputDistribution function, which is added by the jsmva magic. Its parameters are:
<table>
<tr><th>Keyword</th><th>Can be used as positional argument</th><th>Default</th><th>Predefined values</th><th>Description</th></tr>
<tr>
    <td>datasetName</td>
    <td>yes, 1.</td>
    <td>-</td>
    <td>-</td>
    <td> The name of dataset</td>
</tr>
<tr>
    <td>methodName</td>
    <td>yes, 2.</td>
    <td>-</td>
    <td>-</td>
    <td> The name of method</td>
</tr>
</table>

In [18]:
factory.DrawOutputDistribution(dataset, "MLP")

## Classifier Probability Distributions

To draw the classifier probability distribution we use the Factory.DrawProbabilityDistribution function, which is added by the jsmva magic. Its parameters are:
<table>
<tr><th>Keyword</th><th>Can be used as positional argument</th><th>Default</th><th>Predefined values</th><th>Description</th></tr>
<tr>
    <td>datasetName</td>
    <td>yes, 1.</td>
    <td>-</td>
    <td>-</td>
    <td> The name of dataset</td>
</tr>
<tr>
    <td>methodName</td>
    <td>yes, 2.</td>
    <td>-</td>
    <td>-</td>
    <td> The name of method</td>
</tr>
</table>

In [19]:
factory.DrawProbabilityDistribution(dataset, "LD")

# ROC curve

To draw the ROC (receiver operating characteristic) curve we use the Factory.DrawROCCurve function, which is added by the jsmva magic. Its parameters are:
<table>
<tr><th>Keyword</th><th>Can be used as positional argument</th><th>Default</th><th>Predefined values</th><th>Description</th></tr>
<tr>
    <td>datasetName</td>
    <td>yes, 1.</td>
    <td>-</td>
    <td>-</td>
    <td> The name of dataset</td>
</tr>
</table>

In [20]:
factory.DrawROCCurve(dataset)
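
As a reminder of what a ROC curve encodes, the sketch below computes curve points and the area under the curve from toy classifier scores. It assumes the axis convention of background rejection versus signal efficiency and that events with a score at or above the cut are kept as signal; the helper names and scores are invented for illustration and are not taken from JsMVA.

```python
def roc_points(signal_scores, background_scores):
    """Return (signal efficiency, background rejection) for every cut value."""
    cuts = sorted(set(signal_scores) | set(background_scores))
    points = []
    for cut in cuts:
        # Events with a score at or above the cut are kept as signal.
        sig_eff = sum(s >= cut for s in signal_scores) / len(signal_scores)
        bkg_rej = sum(b < cut for b in background_scores) / len(background_scores)
        points.append((sig_eff, bkg_rej))
    return points


def roc_integral(points):
    """Area under the curve by the trapezoidal rule, end points included."""
    # Sort by efficiency, and by falling rejection where the efficiency ties.
    ordered = sorted(points + [(0.0, 1.0), (1.0, 0.0)], key=lambda p: (p[0], -p[1]))
    area = 0.0
    for (x0, y0), (x1, y1) in zip(ordered, ordered[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


separated = roc_integral(roc_points([0.9, 0.8, 0.7], [0.1, 0.2, 0.3]))
overlapping = roc_integral(roc_points([0.6, 0.4], [0.5, 0.3]))
print(separated, overlapping)
```

The second toy sample has one of four signal/background pairs ordered the wrong way round, which gives an area of 0.75.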

# Classifier Cut Efficiencies

To draw the classifier cut efficiencies we use the Factory.DrawCutEfficiencies function, which is added by the jsmva magic. Its parameters are:
<table>
<tr><th>Keyword</th><th>Can be used as positional argument</th><th>Default</th><th>Predefined values</th><th>Description</th></tr>
<tr>
    <td>datasetName</td>
    <td>yes, 1.</td>
    <td>-</td>
    <td>-</td>
    <td> The name of dataset</td>
</tr>
<tr>
    <td>methodName</td>
    <td>yes, 2.</td>
    <td>-</td>
    <td>-</td>
    <td> The name of method</td>
</tr>
</table>

In [21]:
factory.DrawCutEfficiencies(dataset, "MLP")
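
The quantities usually shown in a cut-efficiency plot can be computed by hand for a single cut value. The sketch below does this for toy scores. Which curves the JsMVA plot actually draws is not stated in this notebook, so the set of quantities here (efficiencies, purity, and significance S/sqrt(S+B)) is an assumption, as are the helper name and the expected event counts.

```python
import math


def cut_efficiencies(signal_scores, background_scores, cut,
                     n_signal=100.0, n_background=100.0):
    """Efficiencies, purity and significance for events with score > cut."""
    sig_eff = sum(s > cut for s in signal_scores) / len(signal_scores)
    bkg_eff = sum(b > cut for b in background_scores) / len(background_scores)
    # Expected numbers of selected events for the assumed sample sizes.
    selected_signal = sig_eff * n_signal
    selected_background = bkg_eff * n_background
    total = selected_signal + selected_background
    purity = selected_signal / total if total > 0 else 0.0
    significance = selected_signal / math.sqrt(total) if total > 0 else 0.0
    return {
        "signal_efficiency": sig_eff,
        "background_efficiency": bkg_eff,
        "purity": purity,
        "significance": significance,
    }


result = cut_efficiencies([0.9, 0.8, 0.4, 0.7], [0.1, 0.6, 0.2, 0.3], cut=0.5)
print(result)
```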

## Draw Neural Network

If we trained a neural network, the weights of the network are saved to an XML file and a C file. We can read the XML file back and visualize the network with the Factory.DrawNeuralNetwork function.

The arguments of this function:
<table>
<tr><th>Keyword</th><th>Can be used as positional argument</th><th>Default</th><th>Predefined values</th><th>Description</th></tr>
<tr>
    <td>datasetName</td>
    <td>yes, 1.</td>
    <td>-</td>
    <td>-</td>
    <td> The name of dataset</td>
</tr>
<tr>
    <td>methodName</td>
    <td>yes, 2.</td>
    <td>-</td>
    <td>-</td>
    <td> The name of method</td>
</tr>
</table>

This visualization will be interactive, and we can do the following with it:
* Mouseover (node, weight): focusing
* Zooming and grab and move supported
* Reset: double click

The synapses are drawn in two colors, one for positive weights and one for negative weights. The absolute values of the synapse weights are scaled and mapped to the thickness of the line between two nodes.

In [22]:
factory.DrawNeuralNetwork(dataset, "MLP")
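
The mapping from a synapse weight to a line style can be sketched in a few lines. The colors, the maximum width, and the linear scaling below are all assumptions chosen for illustration; the actual values used by the JsMVA drawing code are not given in this notebook.

```python
def synapse_style(weight, max_abs_weight, max_width=5.0,
                  positive_color="red", negative_color="blue"):
    """Return (color, line width) for one synapse weight (illustrative only)."""
    # The sign of the weight selects the color.
    color = positive_color if weight >= 0 else negative_color
    # The absolute value is scaled linearly to the line width.
    if max_abs_weight > 0:
        width = max_width * abs(weight) / max_abs_weight
    else:
        width = 0.0
    return color, width


weights = [0.5, -2.0, 1.0]
largest = max(abs(w) for w in weights)
styles = [synapse_style(w, largest) for w in weights]
print(styles)
```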

## Draw Deep Neural Network

The DrawNeuralNetwork function can also visualize deep neural networks; we just have to pass "DNN" as the method name. For a very large network with many thousands of neurons, drawing will be slow and will need a lot of RAM, so use this function with care.

This visualization is also interactive and supports the following:

*   Zooming and grab and move supported

In [23]:
factory.DrawNeuralNetwork(dataset, "DNN")

## Draw Decision Tree

The trained decision trees are saved to an XML file too, so we can read the XML file back and visualize the trees. This is the purpose of the Factory.DrawDecisionTree function.

The arguments of this function:
<table>
<tr><th>Keyword</th><th>Can be used as positional argument</th><th>Default</th><th>Predefined values</th><th>Description</th></tr>
<tr>
    <td>datasetName</td>
    <td>yes, 1.</td>
    <td>-</td>
    <td>-</td>
    <td> The name of dataset</td>
</tr>
<tr>
    <td>methodName</td>
    <td>yes, 2.</td>
    <td>-</td>
    <td>-</td>
    <td> The name of method</td>
</tr>
</table>

This function produces a small box where you can enter the index of the tree you want to see (the number of trees is shown before the input box). After choosing the index, press the Draw button. The nodes of the tree are colored according to signal efficiency.

The visualization of the tree is interactive and supports the following:

* Mouseover (node, weight): showing decision path
* Zooming and grab and move supported
* Reset zoomed tree: double click
* Expand all closed subtrees, turn off zoom: button at the bottom of the picture
* Click on node: 

    * hiding subtree, if node children are hidden the node will have a green border
    * rescaling: bigger nodes, bigger texts
    * click again to show the subtree

In [24]:
factory.DrawDecisionTree(dataset, "BDT") #11

## DNN weights heat map

In [25]:
factory.DrawDNNWeights(dataset, "DNN")

## Close the factory's output file

In [26]:
outputFile.Close()