Hi all. I want to save as much output as possible about the fit result and the validity of the minimum found. I see that I can access the (iminuit) information by doing something like:
minimum_info = dict(result.info["original"])
Where "result" is a FitResult instance.
Am I missing a more direct function implemented in zfit?
Indeed, this is most of what there is, as the common interface is shared by multiple minimizers and therefore does not provide all the detailed information that e.g. iminuit provides. But if you have a good idea of what to add, feel free to propose it; that can indeed be helpful. What else you should currently find is:
result.edm
result.params_at_limit
result.valid (which also checks params_at_limit)
result.converged
result.fmin (value of the function at the minimum)
Just let us know (open an issue) if you think there are more things that should be propagated from the minimizer.
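A minimal sketch of collecting the attributes listed above into one dictionary for saving. The helper name summarize_result is hypothetical and not part of zfit; it only reads the attributes mentioned in this thread plus result.info["original"], and it uses a stand-in object (with made-up values) so the snippet runs without an actual fit.

```python
from types import SimpleNamespace


def summarize_result(result):
    """Gather the fit-quality fields mentioned above into a plain dict.

    `summarize_result` is a hypothetical helper, not a zfit function.
    """
    return {
        "edm": result.edm,
        "fmin": result.fmin,
        "valid": result.valid,
        "converged": result.converged,
        "params_at_limit": result.params_at_limit,
        # Minimizer-specific details (e.g. from iminuit) live under info["original"].
        "original": dict(result.info["original"]),
    }


# Stand-in for a FitResult with made-up values, only to make the sketch runnable.
fake_result = SimpleNamespace(
    edm=1.2e-5,
    fmin=1234.5,
    valid=True,
    converged=True,
    params_at_limit=False,
    info={"original": {"nfcn": 87}},
)
summary = summarize_result(fake_result)
```

With a real fit, the same function would be called on the FitResult returned by the minimizer, assuming it exposes the attributes listed in the message above.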
Yes, in fact it is already upgraded in the develop version that will be a new "majorish" release, 0.6.0. If you want, you can install the current dev version with
pip install git+https://github.com/zfit/zfit
Furthermore, there will be a general large upgrade on minimizers, adding SciPy and NLopt with a complete overhaul of the mechanics (currently a PR).
import zfit
from zfit import z
import numpy as np
import tensorflow as tf
zfit.run.set_autograd_mode(False)
class BinnedEfficiencyPDF(zfit.pdf.BasePDF):
    def __init__(self, efficiency, eff_bins, obs, name='BinnedEfficiencyPDF'):
        self.efficiency = efficiency
        self.eff_bins = eff_bins
        super().__init__(obs=obs, name=name)

    def _bin_content(self, x):
        eff_bin = np.digitize(x, self.eff_bins)
        return self.efficiency[eff_bin]

    def _unnormalized_pdf(self, x):  # or even try with PDF
        x = z.unstack_x(x)
        probs = z.py_function(func=self._bin_content, inp=[x], Tout=tf.float64)
        probs.set_shape(x.shape)
        return probs
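A small NumPy-only sketch of the lookup done in _bin_content, in case the indexing is unclear. np.digitize returns 0 for values below the first edge and len(edges) for values above the last one, so the raw index runs from 0 to len(edges). The snippet above does not say whether efficiency carries extra under/overflow entries; the shift-and-clip convention below is just one possible choice, and all numbers are made up.

```python
import numpy as np

eff_bins = np.array([0.0, 1.0, 2.0, 3.0])   # 4 edges -> 3 bins (made-up values)
efficiency = np.array([0.5, 0.8, 0.9])      # one efficiency per bin (made-up values)

x = np.array([-0.5, 0.2, 1.5, 2.9, 3.5])

# Raw digitize output: 0 = below first edge, len(eff_bins) = above last edge.
raw_index = np.digitize(x, eff_bins)

# Shift to 0-based bin numbers and clip so that under/overflow values
# fall into the first/last bin instead of indexing out of bounds.
bin_index = np.clip(raw_index - 1, 0, len(efficiency) - 1)
eff = efficiency[bin_index]
```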
set_yield(...) works
FFTConvPDFV1) in zfit. How might I use FFTConvPDFV1 to create a PDF equivalent to the Voigtian in RooFit? I tried creating a new instance with the kernel set to a Gaussian and func set to the RBW from the zfit_physics / bw branch, but it seems I cannot use this straightaway. Is there something I'm missing?
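For orientation, a NumPy-only sketch of what such a convolution computes numerically. It does not use FFTConvPDFV1 or zfit_physics, and it uses a Lorentzian (non-relativistic Breit-Wigner) as a stand-in for the RBW; the widths and the grid are illustrative assumptions.

```python
import numpy as np


def gauss(x, sigma):
    return np.exp(-0.5 * (x / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))


def lorentz(x, gamma):
    # Non-relativistic Breit-Wigner (Cauchy shape) with full width `gamma`.
    half = 0.5 * gamma
    return half / (np.pi * (x ** 2 + half ** 2))


# Symmetric grid with an odd number of points so that x = 0 is a grid point.
x = np.linspace(-20.0, 20.0, 4001)
dx = x[1] - x[0]

# Discrete approximation of the convolution integral on the grid.
voigt = np.convolve(lorentz(x, gamma=2.0), gauss(x, sigma=1.0), mode="same") * dx
```

The result is lower and broader at the peak than the Lorentzian alone, and its integral over the finite grid stays slightly below 1 because the heavy Lorentzian tails are cut off.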
We've released the 0.6 series of zfit! The major addition is a lot of new minimizers that all support uncertainty estimation the same way as used now.
They can now be invoked independently of zfit models and used with pure Python functions.
The main changes (full changelog here):
Added many new minimizers. A full list can be found in :ref:minimize_user_api.
  IpyoptV1 that wraps the powerful Ipopt large scale minimization library
  ScipyLBFGSBV1, or ScipySLSQPV1
  NLoptLBFGSV1, NLoptTruncNewtonV1 or NLoptMMAV1 but also includes more global minimizers such as NLoptMLSLV1 and NLoptESCHV1.
Completely new and overhauled minimizers design, including:
  init to minimize
Major overhaul of the FitResult, including:
  zfit_error (equivalent of MINOS)
  minuit_hesse and minuit_minos are now available with all minimizers as well thanks to an great
  approx hesse that returns the approximate hessian (if available, otherwise empty)
Hey @mayou36! :)
I wrote to you regarding GSoC 2021 (via aman.goel185@gmail.com), and I have a question about the evaluation task.
Can I contact you over private chat?
We released multiple small releases up to 0.6.3 with a few minor improvements and bugfixes. Make sure to upgrade to the latest version using
pip install -U zfit
Thanks to the finders of the bugs. We appreciate any kind of (informal) feedback, ideas or bug reports; feel free to reach out to us anytime with anything.
Hi. I am intensively using zfit in notebooks, and I have been running into the well-known NameAlreadyTakenErrors. I have found workarounds that work for me, but I just wanted to say that it seems to me the example presented in here does not work. Like, if you try to use this you will get an error when minimising:
~/.local/lib/python3.6/site-packages/zfit/minimizers/minimizer_minuit.py in <listcomp>(.0)
76 errors = tuple(param.step_size for param in params)
77 start_values = [p.numpy() for p in params]
---> 78 limits = [(low.numpy(), up.numpy()) for low, up in limits]
79 errors = [err.numpy() for err in errors]
80
AttributeError: 'int' object has no attribute 'numpy'
I guess the problem is that the limits are supposed to be tf.Tensor, but if we simply assign a float or int via param.lower or param.upper, that breaks the code later?
Couldn't there be some sort of a method, such as set_limit_lower(value)? Or am I misusing zfit somehow?
Hi, to overload the parameters in Jupyter, does the following way work?
iCell = get_ipython().execution_count #get the current cell number
par1 = zfit.Parameter("par1"+str(iCell), 8., 0., 20.)
par2 = zfit.Parameter("par2"+str(iCell), -20., -50., 50.)
par3 = zfit.Parameter("par3_"+str(iCell), 10., 0., 20.)
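A variant of the same idea that does not rely on the IPython execution count: a running counter appended to the base name. The helper unique_name is hypothetical, not part of zfit; this is only a sketch of how unique names could be generated.

```python
import itertools

# Module-level counter; every call to unique_name consumes one number.
_counter = itertools.count()


def unique_name(base):
    """Return `base` with a running number appended (hypothetical helper)."""
    return f"{base}_{next(_counter)}"


names = [unique_name("par1"), unique_name("par1"), unique_name("par2")]
```

The generated string would then be passed as the first argument in place of "par1"+str(iCell) in the lines above.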
Hi again :) I was wondering if there is a way to have a parameter whose limit depends on another parameter. As a naive illustration, imagine you are fitting a parabola a*x**2 + b*x + c, and you want the peak of the parabola to be between 0 and 5. This would mean 0 < -b/(2a) < 5.
I naively tried something similar to this:
a = zfit.Parameter("a", 5, floating=True)
b = zfit.Parameter("b", 0, lower = -10*a, floating=True)
However, it seems that all this does is set the limit to -10 * the initial value of a. Is there a way to have the limit follow the value of a as a changes?
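One way to look at the parabola example, sketched in plain Python rather than with zfit objects: treat the vertex position v = -b/(2a) as the quantity with fixed limits 0 and 5, and derive b from it. The function names are hypothetical, and whether this maps cleanly onto a dependent parameter in zfit is left open here.

```python
def b_from_vertex(a, vertex):
    """Coefficient b of a*x**2 + b*x + c whose vertex sits at `vertex`."""
    return -2.0 * a * vertex


def vertex_position(a, b):
    """Position of the extremum of a*x**2 + b*x + c."""
    return -b / (2.0 * a)


a = 5.0
for vertex in (0.0, 2.5, 5.0):  # the limits on `vertex` no longer depend on a
    b = b_from_vertex(a, vertex)
    assert abs(vertex_position(a, b) - vertex) < 1e-12
```

The constraint 0 < -b/(2a) < 5 then becomes a plain box constraint on the vertex, whatever value a takes during the fit.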
A new zfit version, the 0.8.x series, is available, with bugfixes, improved numerical integration and different Kernel Density Estimations (also for large sample sizes).
The tutorials have also improved in style and now have their own site. They can be run interactively, downloaded, or viewed.
Hi experts.
I had a question regarding errors. I have made a quick example to highlight my question, hopefully it makes sense.
I have noticed that if I fit my pdf, and then refit that same pdf again and again in a loop, the error I get out is not the same but keeps changing. Also the errors calculated using different methods are not consistent, at least not always, even after the initial fit (e.g. 0th iteration).
I don't quite understand this. I am not an expert in fitting, so maybe this is expected, but I find it very weird.
I'll try to illustrate this with an example:
import numpy as np
import zfit
from zfit.pdf import Gauss, Exponential
obs = zfit.Space("x", limits=(-5, 5))
minimizer = zfit.minimize.Minuit(use_minuit_grad=True)
mu = zfit.Parameter("muu", 0, step_size=0.01)
sigma = zfit.Parameter("sigma", 1,step_size=0.01)
gauss = zfit.pdf.Gauss(mu=mu, sigma=sigma, obs=obs)
gauss_yield = zfit.Parameter("g_yield", 100, step_size=0.1)
gauss_ext = gauss.create_extended(gauss_yield)
lam = zfit.Parameter("lam", -1,step_size=0.01)
expo = zfit.pdf.Exponential(lam=lam, obs=obs)
expo_yield = zfit.Parameter("e_yield", 500, step_size=0.1)
expo_ext = expo.create_extended(expo_yield)
gauss_expo = zfit.pdf.SumPDF([expo_ext, gauss_ext])
random_gauss = np.random.normal(size=500)+1
random_exp = np.random.exponential(scale = 5, size=1000)-5
random_data = np.append(random_gauss, random_exp)
# then for each different error method I run this:
lam.set_value(-1)
mu.set_value(1)
sigma.set_value(1)
# frac.set_value(0.5)  # `frac` is not defined anywhere in this example
expo_yield.set_value(100)
gauss_yield.set_value(100)
data = zfit.Data.from_numpy(obs=obs, array=random_data)
nll = zfit.loss.UnbinnedNLL(model=gauss_expo, data=data)
iterations = np.arange(0,20)
yield_error = []
for it in iterations:
    result = minimizer.minimize(nll)
    result.errors()  # here I also try result.hesse() with method='hesse_np', 'approx', 'minuit_hesse'
    yield_error.append(result.params[gauss_yield]['minuit_minos']['upper'])
Each of these iterations produces a different error, particularly when using minuit_hesse and hesse_np:
#minuit_minos upper (lower is almost the same with a negative sign)
array([118.3930117 , 118.13616212, 118.17518686, 118.1589878 ,
118.17934101, 118.17029047, 118.18437411, 118.17827166,
118.18786236, 118.1837249 , 118.19027599, 118.1874818 ,
118.19182178, 118.1899428 , 118.19290011, 118.19164637,
118.19360303, 118.19276492, 118.19412068, 118.19355127])
#minuit_hesse
array([264.56868112, 263.99099257, 264.03559365, 263.99005815,
264.0375983 , 264.01347448, 590.9170118 , 264.02200987,
590.84657398, 264.03923864, 592.87290404, 264.05010254,
589.78390794, 426.94055222, 588.76391966, 585.73316667,
591.81967819, 587.7401319 , 591.81833332, 592.84125751])
#hesse_np
array([205.6559601 , 207.41405669, nan, nan,
nan, 737.84227104, nan, 70.71791776,
314.54393221, nan, 110.97088789, 237.15737011,
nan, nan, nan, nan,
nan, 149.10569392, 300.7499339 , nan])
#approx
array([118.39301118, 118.1361616 , 118.17518634, 118.15898728,
118.17934049, 118.17028995, 118.18437359, 118.17827114,
118.18786184, 118.18372437, 118.19027547, 118.18748128,
118.19182126, 118.18994228, 118.19289958, 118.19164585,
118.19360251, 118.1927644 , 118.19412016, 118.19355074])
I understand that, generally speaking, I am not supposed to loop over an already converged fit again, but what puzzles me is that even if I only look at the very first element in each of these lists, they are not at all consistent. I noticed this in a more complicated fit that I am doing in an analysis. I prepared this simple mock example to make it easier to reproduce.
Is this expected? Am I doing something crazy here?
Sorry for the long question, and thanks a lot.
Hi, first of all, thanks a lot for bringing it up and making such a good reproducible example. You are also welcome to open an issue. The problem is that you create an unbinned likelihood (=> create an ExtendedUnbinnedNLL instead, this works for me), so you are not constraining the sum of yields to be the (Poisson distributed) number of events. There should be a warning displayed like
AdvancedFeatureWarning: Either you're using an advanced feature OR causing unwanted behavior. To turn this warning off, use `zfit.settings.advanced_warnings['extended_in_UnbinnedNLL']` = False` or 'all' (use with care) with `zfit.settings.advanced_warnings['all'] = False
Extended PDFs are given to a normal UnbinnedNLL. This won't take the yield into account and simply treat the PDFs as non-extended PDFs. To create an extended NLL, use the `ExtendedUnbinnedNLL`.
warn_advanced_feature("Extended PDFs are given to a normal UnbinnedNLL. This won't take the yield "
So the fit you are doing is equal to defining a sum of two pdfs using two free parameters: we end up with one degree of freedom too many. This is what causes the error to vary each time (at least I suspect it).
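To make the extra degree of freedom concrete, here is a NumPy-only sketch (not zfit code). It assumes the non-extended sum only sees the yields through the fraction n_sig / (n_sig + n_bkg), uses a flat background instead of the exponential for brevity, and neglects the truncation of the Gaussian to the fit range; all numbers are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
low, high = -5.0, 5.0
x = np.concatenate([rng.normal(1.0, 1.0, 500), rng.uniform(low, high, 1000)])
x = x[(x > low) & (x < high)]  # keep only events inside the fit range


def gauss_pdf(x, mu=1.0, sigma=1.0):
    # Truncation to [low, high] is neglected; the tails beyond are tiny here.
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))


def flat_pdf(x):
    return np.full_like(x, 1.0 / (high - low))


def nll(n_sig, n_bkg):
    """Non-extended NLL: the yields only enter through the fraction."""
    frac = n_sig / (n_sig + n_bkg)
    return -np.sum(np.log(frac * gauss_pdf(x) + (1.0 - frac) * flat_pdf(x)))


def extended_nll(n_sig, n_bkg):
    """Adds the Poisson term for the total yield (constant log(N!) dropped)."""
    n_tot = n_sig + n_bkg
    return nll(n_sig, n_bkg) + n_tot - len(x) * np.log(n_tot)


same = np.isclose(nll(100.0, 500.0), nll(200.0, 1000.0))
differs = not np.isclose(extended_nll(100.0, 500.0), extended_nll(200.0, 1000.0))
```

Scaling both yields by the same factor leaves the non-extended value unchanged, so the minimum is a whole line rather than a point; the Poisson term in the extended version removes that flat direction.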
To explain the errors: minuit_minos is the builtin minuit error (from iminuit, the minos method). minuit_hesse is the hesse algorithm of iminuit. approx is the minimizer's approximation of the Hessian (it may not be available or may be completely off. It's just a "better than nothing", but often accurate enough for some use cases such as getting the order of magnitude). hesse_np is zfit's implementation of Hesse, and the NaNs are probably pretty accurate: it can't determine the Hessian because it fails for the good reason that it's an underconstrained problem.
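As a reminder of what a Hesse-type error is, here is a one-parameter sketch in plain NumPy. It shows the textbook idea only and is not how zfit or iminuit implement it: for a negative log-likelihood, the error is 1/sqrt of the second derivative at the minimum. The data and the step size h are made up.

```python
import numpy as np

rng = np.random.default_rng(1)
sigma = 2.0
data = rng.normal(0.0, sigma, 1000)


def nll(mu):
    # Gaussian NLL as a function of mu with known sigma; constants dropped.
    return np.sum((data - mu) ** 2) / (2.0 * sigma ** 2)


mu_hat = data.mean()  # analytic minimum of this NLL

# Central finite difference for the second derivative at the minimum.
h = 1e-3
second_derivative = (nll(mu_hat + h) - 2.0 * nll(mu_hat) + nll(mu_hat - h)) / h ** 2

hesse_error = 1.0 / np.sqrt(second_derivative)
expected = sigma / np.sqrt(len(data))  # known result for the mean of a Gaussian
```

If the curvature along some direction is zero or negative, as in an underconstrained problem, the square root or the matrix inversion in the multi-parameter case has no sensible result, which is consistent with the NaNs seen above.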
Just to mention, one method you didn't try is zfit_error, zfit's own implementation of "minos". In my test it gives a comparable error (42 vs 39 from minos) using the ExtendedUnbinnedNLL.
UnbinnedNLL seemed to be the culprit. I still need to investigate why the fit where I initially spotted this was misbehaving, as there I was using the correct ExtendedUnbinnedNLL. But your answer offers some hints, so I will try them. It also brings a bit more clarity about zfit overall, thanks :)