I think there are a bunch of conflated issues here.
local variable 'best_clf' referenced before assignment". I updated the
particular case because of the pickling error.
start, even without parallelization. The last change was 6 years ago, and
I'm afraid that things changed in between and the code wasn't updated. I
Post by marco tettamanti
Dear Matteo,
thank you for kindly replying!
Yes, I do have the latest versions of joblib (0.11) and sklearn (0.19.1),
see at the bottom of this email.
The problem seems to occur regardless of whether I run the code in jupyter or invoke
ipython or python directly in the console.
I am now wondering whether there may be something wrong in my snippet.
UnboundLocalError: local variable 'best_clf' referenced before
#best_clfs = {} #moved down 7 lines
confusion = ConfusionMatrix()
verbose(1, "Estimating error using nested CV for model selection")
partitioner = partitionerCD
splitter = Splitter('partitions')
tms = Parallel(n_jobs=2)(delayed(_run_one_partition)(isplit, partitions)
for isplit, partitions in enumerate(partitionerCD.
best_clfs = {}
confusion += tm.ca.stats
best_clfs[tm.measure.descr] = best_clfs.get(tm.measure.descr, 0) + 1
But now, running the snippet in ipython/python specifically for the SVM
UnboundLocalError: local variable 'best_clf' referenced before
May this be the culprit? As a reminder, the full snippet I am using is
included in my previous email.
Thank you and very best wishes,
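
One way an UnboundLocalError of this kind can arise, sketched here under the assumption that the selection loop silently skips every classifier that fails to train or predict: if no candidate ever succeeds, `best_clf` is never bound, and the first later reference to it raises. The names `select_best` and `flaky_score` and the candidate labels below are hypothetical stand-ins, not PyMVPA calls. The sketch only illustrates the control flow, plus a defensive variant that binds the names before the loop and fails with a clearer message.

```python
def select_best(candidates, fit_and_score):
    """Return (best_candidate, best_error); fail loudly if none could be scored."""
    best, best_error = None, None   # bind both names before the loop
    for cand in candidates:
        try:
            error = fit_and_score(cand)
        except Exception:
            # the candidate could not be trained/evaluated: skip it
            continue
        if best_error is None or error < best_error:
            best, best_error = cand, error
    if best is None:
        # without the initial binding above, this is the situation in which
        # a later use of the "best" name raises UnboundLocalError
        raise RuntimeError("no candidate could be evaluated")
    return best, best_error


def flaky_score(name):
    # Hypothetical scorer: 'svm' always fails, the others return a fixed error
    if name == "svm":
        raise ValueError("could not be evaluated")
    return {"knn": 0.4, "lda": 0.25}[name]


selected = select_best(["svm", "knn", "lda"], flaky_score)
```

When only a single classifier is passed in and that classifier fails, the loop body never reaches the assignment, which would be consistent with the error appearing only in the single-SVM test.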
Hi Marco,
some ideas in random order
- what version of sklearn/joblib are you using? I would make sure to use
the latest version of joblib (0.11), perhaps not importing it from sklearn (unless
you have the latest sklearn version, 0.19.1)
- are you running the code in a jupyter notebook? There might be some
issues with that (see https://github.com/joblib/joblib/issues/174). As a
test you might try to convert your notebook to a script and then run it
Dear Matteo (and others),
sorry, I am again asking for your help!
I have experimented with the analysis of my dataset using an adaptation of
your joblib-based gist.
As I wrote before, it works perfectly, but not with some classifiers: SVM
classifiers always cause the code to terminate with an error.
myclassif=clfswh['!gnpp','!skl','svm'] #Note that 'gnpp' and
#'skl' were excluded for independent reasons
the code runs through without errors.
'[TransferMeasure(measure=SVM(svm_impl='C_SVC', kernel=LinearLSKernel(),
weight=[], probability=1,
weight_label=[]), splitter=Splitter(space='partitions'),
pickle SwigPyObject objects",)'
After googling for what may cause this particular error, I have found that
the situation improves slightly (i.e. more splits executed, sometimes even
import os
from sklearn.externals.joblib import Parallel, delayed
from sklearn.externals.joblib.parallel import parallel_backend
However, also in this case, the code invariably terminates with a long
error message (I only report an extract, but in case I can send the whole
<type 'str'>: (<type 'exceptions.UnicodeEncodeError'>,
...........................\n/usr/lib/python2.7/runpy.py in
alter_argv=1)\n 169 pkg_name = mod_name.rpartition(\'.\')[0]\n
main_globals = sys.modules["__main__"].__dict__\n 171 if
alter_argv:\n 172 sys.argv[0] = fname\n 173 return
main_globals, None,\n--> 174
I think I have sort of understood that the problem is due to some failure
in pickling the parallelized jobs, but I have no clues if and how it can be
Do you have any suggestions?
Thank you and very best wishes,
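
A minimal sketch of the pickling constraint, assuming the failure comes from sending an unpicklable object back from a worker process: joblib's process-based backends move task arguments and results between processes by pickling them, so a returned object that wraps a C-level handle cannot cross that boundary. Here `threading.Lock()` is only a stand-in for such an object, and `run_fold` with its placeholder error value is hypothetical. The idea is to keep the unpicklable object inside the worker and return plain data only.

```python
import pickle
import threading


def is_picklable(obj):
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except Exception:
        return False


def run_fold(isplit):
    """Hypothetical worker: keep the unpicklable object local, return plain data."""
    handle = threading.Lock()   # stand-in for an object that cannot be pickled
    assert not is_picklable(handle)
    # ... training and testing with the handle would happen here ...
    return {"split": isplit, "error": 0.25}   # placeholder value, not a real result


result = run_fold(0)
print(is_picklable(result))
```

Checking a worker's return value with `is_picklable` in a serial run is a cheap way to find out which object is the problem before involving any parallel backend.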
Summary for targets across chunks
  targets mean std min max #chunks
    C      0.5 0.5  0   1     18
    D      0.5 0.5  0   1     18
#Evaluate prevalent best classifier with nested crossvalidation
verbose.level = 5
partitionerCD = ChainNode([NFoldPartitioner(cvtype=2, attr='chunks'),
Sifter([('partitions', 2), ('targets', ['C', 'D'])])], space='partitions')
# training partitions
training = fds[fds_.sa.partitions == 1]
#print list(zip(training.sa.chunks, training.sa.targets))
# testing partitions
testing = fds[fds_.sa.partitions == 2]
#print list(zip(testing.sa.chunks, testing.sa.targets))
"""Select best model according to CVTE
Helper function which we will use twice -- once for proper nested
cross-validation, and once to see how big an optimistic bias due
to model selection could be if we simply provide an entire dataset.
dstrain_ : Dataset
clfs : list of Classifiers
Which classifiers to explore
best_clf, best_error
best_error = None
cv = CrossValidation(clf, partitionerCD)
# unfortunately we don't have ability to reassign clf atm
# cv.transerror.clf = clf
error = np.mean(cv(dstrain_))
# skip the classifier if data was not appropriate and it
# failed to learn/predict at all
best_clf = clf
best_error = error
verbose(4, "Classifier %s cv error=%.2f" % (clf.descr, error))
verbose(3, "Selected the best out of %i classifiers %s with error %.2f"
% (len(clfs), best_clf.descr, best_error))
return best_clf, best_error
# This function will run all classifiers for a single partition
myclassif=clfswh['!gnpp','!skl'][5:6] #Testing a single SVM classifier
def _run_one_partition(isplit, partitions, classifiers=myclassif): #see §§
verbose(2, "Processing split #%i" % isplit)
dstrain, dstest = list(splitter.generate(partitions))
best_clf, best_error = select_best_clf(dstrain, classifiers)
# now that we have the best classifier, lets assess its transfer
# to the testing dataset while training on entire training
tm = TransferMeasure(best_clf, splitter,postproc=
BinaryFxNode(mean_mismatch_error,space='targets'), enable_ca=['stats'])
return tm
#import os
#from sklearn.externals.joblib import Parallel, delayed
#from sklearn.externals.joblib.parallel import parallel_backend
# Parallel estimate error using nested CV for model selection
confusion = ConfusionMatrix()
verbose(1, "Estimating error using nested CV for model selection")
partitioner = partitionerCD
splitter = Splitter('partitions')
# Here we are using joblib Parallel to parallelize each partition
# Set n_jobs to the number of available cores (or how many you want to use)
# tms = Parallel(n_jobs=2)(delayed(_run_one_partition)(isplit,
tms = Parallel(n_jobs=2)(delayed(_run_one_partition)(isplit, partitions)
for isplit, partitions in enumerate(partitionerCD.
# Parallel returns a list with the results of each parallel loop, so we need to
# unravel it to get the confusion matrix
best_clfs = {}
confusion += tm.ca.stats
best_clfs[tm.measure.descr] = best_clfs.get(tm.measure.descr, 0) + 1
########## * ##########
Dear Matteo,
thank you so much, this is precisely the kind of thing I was looking for: it
works like a charm!
Hi Marco,
in your case, I would then recommend looking into joblib to parallelize
your for loops (https://pythonhosted.org/joblib/parallel.html).
As an example, here's a gist containing part of the PyMVPA's nested_cv
example where I parallelized the loop across partitions. I feel this is
what you might want to do in your case, since you have a lot more folds.
Dear Matteo,
thank you for the willingness to look into my code.
This is taken almost verbatim from http://dev.pymvpa.org/
examples/nested_cv.html, except for the leave-one-pair-out partitioning,
and a slight reduction in the number of classifiers (in the original
example, they are around 45).
Any help or suggestion would be greatly appreciated!
All the best,
#Evaluate prevalent best classifier with nested crossvalidation
verbose.level = 5
partitionerCD = ChainNode([NFoldPartitioner(cvtype=2, attr='chunks'),
Sifter([('partitions', 2), ('targets', ['C', 'D'])])], space='partitions')
# training partitions
training = fds[fds_.sa.partitions == 1]
#print list(zip(training.sa.chunks, training.sa.targets))
# testing partitions
testing = fds[fds_.sa.partitions == 2]
#print list(zip(testing.sa.chunks, testing.sa.targets))
"""Select best model according to CVTE
Helper function which we will use twice -- once for proper nested
cross-validation, and once to see how big an optimistic bias due
to model selection could be if we simply provide an entire dataset.
dstrain_ : Dataset
clfs : list of Classifiers
Which classifiers to explore
best_clf, best_error
best_error = None
cv = CrossValidation(clf, partitionerCD)
# unfortunately we don't have ability to reassign clf atm
# cv.transerror.clf = clf
error = np.mean(cv(dstrain_))
# skip the classifier if data was not appropriate and it
# failed to learn/predict at all
best_clf = clf
best_error = error
verbose(4, "Classifier %s cv error=%.2f" % (clf.descr, error))
verbose(3, "Selected the best out of %i classifiers %s with error %.2f"
% (len(clfs), best_clf.descr, best_error))
return best_clf, best_error
best_clfs = {}
confusion = ConfusionMatrix()
verbose(1, "Estimating error using nested CV for model selection")
partitioner = partitionerCD
splitter = Splitter('partitions')
verbose(2, "Processing split #%i" % isplit)
dstrain, dstest = list(splitter.generate(partitions))
best_clf, best_error = select_best_clf(dstrain, clfswh['!gnpp','!skl'])
best_clfs[best_clf.descr] = best_clfs.get(best_clf.descr, 0) + 1
# now that we have the best classifier, lets assess its transfer
# to the testing dataset while training on entire training
tm = TransferMeasure(best_clf, splitter,
space='targets'), enable_ca=['stats'])
confusion += tm.ca.stats
########## * ##########
What do you mean by "cycling over approx 40 different classifiers"? Are
you testing different classifiers? If that's the case, a possibility is to
create a script that takes as argument the type of classifiers and runs the
classification across all folds. In that way you can submit 40 jobs and
parallelize across classifiers.
If that's not the case, because the folds are independent and deterministic,
I would create a script that performs the classification on blocks of folds
(say folds 1 to 30, 31 to 60, etc.), and then submit the blocks as separate jobs, so
as to parallelize there.
I think that if you send a snippet of the code you're using, it will be
clearer which are good points for parallelization.
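
The block-of-folds idea can be sketched as follows. This is only an illustration under stated assumptions: the function names, the `--first-fold`/`--last-fold` options and the block size of 30 are hypothetical choices, and the actual classification call is left out. Each submitted job would receive one (first, last) pair and process only those folds.

```python
import argparse


def fold_blocks(n_folds, block_size):
    """Return inclusive, 1-based (first, last) fold numbers for each block."""
    return [(start, min(start + block_size - 1, n_folds))
            for start in range(1, n_folds + 1, block_size)]


def parse_args(argv=None):
    """Parse the fold range that a single job should process."""
    parser = argparse.ArgumentParser(description="Run one block of CV folds")
    parser.add_argument("--first-fold", type=int, required=True)
    parser.add_argument("--last-fold", type=int, required=True)
    return parser.parse_args(argv)


# One job per block: 324 folds in blocks of 30
blocks = fold_blocks(324, 30)

# What a single job would see if given the second block
args = parse_args(["--first-fold", str(blocks[1][0]),
                   "--last-fold", str(blocks[1][1])])
folds_for_this_job = list(range(args.first_fold, args.last_fold + 1))
```

Because the partitioning is deterministic, every job can regenerate the full list of partitions and simply skip the folds outside its own range.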
Dear Matteo and Nick,
thank you for your responses.
I take this opportunity to ask some follow-up questions, because I am struggling to
make pymvpa2 computations faster and more efficient.
I often find myself giving up on a particular analysis,
because it is going to take far more time than I can bear (weeks, months!). This
happens particularly with searchlight permutation testing (gnbsearchlight is
much faster, but does not support pprocess), and nested cross-validation.
As for the latter, for example, I recently wanted to run nested cross-validation
in a sample of 18 patients and 18 controls (1 image per subject), training the
classifiers to discriminate patients from controls in a leave-one-pair-out
partitioning scheme. This yields 18*18=324 folds. For a small ROI of 36 voxels,
cycling over approx 40 different classifiers takes about 2 hours for each fold
on a decent PowerEdge T430 Dell server with 128GB RAM. This means approx. 27
days for all 324 folds!
The same server is equipped with 32 CPUs. With full parallelization, the same
analysis may be completed in less than one day. This is the reason for my
interest in, and questions about, parallelization.
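
As a rough check of the figures above, the arithmetic can be written out. The only assumptions added here are that every fold takes the same 2 hours, that each CPU processes one fold at a time, and that parallelization adds no overhead.

```python
import math

n_patients, n_controls = 18, 18
hours_per_fold = 2.0      # figure quoted above
n_cpus = 32

# Leave-one-pair-out: one fold per (patient, control) pair
n_folds = n_patients * n_controls
serial_hours = n_folds * hours_per_fold
serial_days = serial_hours / 24.0

# Folds are indivisible, so the run needs ceil(n_folds / n_cpus) rounds
# (assumes equal fold times and no scheduling overhead)
rounds = int(math.ceil(n_folds / float(n_cpus)))
parallel_hours = rounds * hours_per_fold

print(n_folds, serial_days, parallel_hours)
```

Under these idealized assumptions the serial run takes 27 days and the fully parallel run 22 hours, in line with the "less than one day" estimate.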
Is there anything that you experts do in such situations to speed up the
computation or make it more efficient?
Thank you again and best wishes,
There have been some plans / minor attempts at using parallelisation more
widely, but as far as I know we only support pprocess, and only for (1)
searchlight; (2) surface-based voxel selection; and (3) hyperalignment. I
do remember that parallelisation of other functions was challenging due to
some issues getting the conditional attributes set right, but this is long time
Hi Marco,
AFAIK, there is no support for parallelization at the level of
cross-validation. Usually for a small ROI (such as a searchlight) and with
standard CV schemes, the process is quite fast, and the bottleneck is
really the number of searchlights to be computed (for which parallelization
In my experience, we tend to parallelize at the level of individual
participants; for example we might set up a searchlight analysis with
however many n_procs you can have, and then submit one such job for every
participant to a cluster (using either torque or condor).
Dear all,
forgive me if this has already been asked in the past, but I was wondering
whether there has been any development in the meantime.
Are there any chances that one can generally apply parallel computing (multiple
CPUs or clusters) with pymvpa2, in addition to what is already implemented for
searchlight (pprocess)? That is, also for general cross-validation, nested
cross-validation, permutation testing, RFE, etc.?
Has anyone had successful experience with parallelization schemes such as
ipyparallel, condor, or others?
Thank you and best wishes!
