Comparing Power Law with other Distributions
I'm using Aaron Clauset's powerlaw package to try fitting my data to a Power Law.
First, some details on my data:
- It is discrete (word-count data);
- It is heavily right-skewed (skewness is approx. 16);
- It is leptokurtic (kurtosis is approx. 300).
What I have done so far
df_data is my DataFrame, where word_count is a Series containing word counts for around 1000 word tokens.
First I've generated a fit object:
fit = powerlaw.Fit(data=df_data.word_count, discrete=True, verbose=False, xmin=1, xmax=200)
Next, I compare the power-law fit of my data against other distributions, namely lognormal, exponential, lognormal_positive, stretched_exponential, and truncated_power_law, using the fit.distribution_compare(distribution_one, distribution_two) method.
From the distribution_compare method, I've obtained the following (R, p) tuples for each of the comparisons:
- fit.distribution_compare('power_law', 'lognormal') = (0.35617607052907196, 0.73466960075186816)
- fit.distribution_compare('power_law', 'exponential') = (481.35250943681206, 4.3450007097178692e-05)
- fit.distribution_compare('power_law', 'lognormal_positive') = (89.186233734863649, 4.1315378698322223e-08)
- fit.distribution_compare('power_law', 'stretched_exponential') = (1.7564708682020371, 0.2974294888802046)
- fit.distribution_compare('power_law', 'truncated_power_law') = (-0.003684604382383605, 0.93159035254165268)
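For intuition about what distribution_compare reports, here is a from-scratch sketch of a normalized log-likelihood ratio test (Vuong's test) comparing a continuous power law against an exponential. The function name and synthetic data are my own illustration, not part of the powerlaw package, and the continuous case is used for brevity even though word counts are discrete.

```python
import math
import random

def loglik_ratio_test(data, xmin=1.0):
    """Vuong-style normalized log-likelihood ratio test: continuous power law
    vs. shifted exponential, both fit by maximum likelihood on x >= xmin.
    Returns (R, p): R > 0 favors the power law; p is the two-sided
    significance of R's sign under a normal approximation."""
    xs = [x for x in data if x >= xmin]
    n = len(xs)
    # MLE for the power-law exponent (continuous case)
    alpha = 1.0 + n / sum(math.log(x / xmin) for x in xs)
    # MLE for the exponential rate on the shifted data
    lam = n / sum(x - xmin for x in xs)
    # Pointwise log-likelihood differences: log p_powerlaw(x) - log p_exp(x)
    d = [(math.log(alpha - 1) - math.log(xmin) - alpha * math.log(x / xmin))
         - (math.log(lam) - lam * (x - xmin)) for x in xs]
    R = sum(d)
    mean_d = R / n
    sigma = math.sqrt(sum((di - mean_d) ** 2 for di in d) / n)
    # Two-sided p-value for the sign of R
    p = math.erfc(abs(R) / (sigma * math.sqrt(2.0 * n)))
    return R, p

random.seed(0)
# Synthetic power-law sample (alpha = 2.5) via inverse-transform sampling
sample = [random.random() ** (-1.0 / 1.5) for _ in range(2000)]
R, p = loglik_ratio_test(sample)
print(f"R = {R:.2f}, p = {p:.3g}")  # R should be strongly positive here
```

Since the sample really is power-law distributed, the power law should be decisively favored over the exponential, mirroring the large positive R in the exponential comparison above.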
From the powerlaw documentation:
R : float
The loglikelihood ratio of the two sets of likelihoods. If positive, the first set of likelihoods is more likely (and so the probability distribution that produced them is a better fit to the data). If negative, the reverse is true.
p : float
The significance of the sign of R. If below a critical value (typically .05) the sign of R is taken to be significant. If above the critical value the sign of R is taken to be due to statistical fluctuations.
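The documented decision rule can be expressed as a small helper (the function name is my own, not part of the powerlaw package): the sign of R only carries information when p is below the critical value.

```python
def interpret_lrt(r, p, alpha=0.05, first="power_law", second="alternative"):
    """Turn an (R, p) pair from distribution_compare into a verdict.
    The sign of R is only meaningful when p is below the significance level."""
    if p >= alpha:
        return f"inconclusive: neither {first} nor {second} is favored (p = {p:.3g})"
    winner = first if r > 0 else second
    return f"{winner} is favored (R = {r:.3g}, p = {p:.3g})"

# Applied to two of the question's reported tuples:
print(interpret_lrt(0.356, 0.735, second="lognormal"))
print(interpret_lrt(481.35, 4.35e-05, second="exponential"))
```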
From the comparison results between the power-law, exponential, and lognormal distributions, I feel inclined to say that my data follow a power-law distribution.
Would this be a correct interpretation/assumption about the test results? Or perhaps I'm missing something?
1 answer

First off, while the methods may have been developed by me, Cosma Shalizi, and Mark Newman, our implementations are in Matlab and R. The Python implementation I think you're using is by Jeff Alstott, Javier del Molino Matamala, or Joel Ornstein (all of these are available from my website).
Now, about the results. A likelihood ratio test (LRT) does not allow you to conclude that you do or do not have a power-law distribution. It's only a model comparison tool, meaning it evaluates whether the power law is a less terrible fit to your data than some alternative. (I phrase it that way because an LRT is not a goodness-of-fit method.) Hence, even if the power-law distribution is favored over all the alternatives, it doesn't mean your data are power-law distributed. It only means that the power-law model is a less terrible statistical model of the data than the alternatives are.
To evaluate whether the power-law distribution itself is a statistically plausible model, you should compute the p-value for the fitted power-law model, using the semi-parametric bootstrap we describe in our paper. If p > 0.1, and the power-law model is favored over the alternatives by the LRT, then you can conclude relatively strong support for your data following a power-law distribution.
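That bootstrap can be sketched in a few lines. This is a simplified, fully parametric variant (assuming continuous data and xmin fixed at the data minimum, so the semi-parametric resampling below xmin from the paper is omitted); the function names are my own.

```python
import math
import random

def fit_alpha(xs, xmin):
    """Continuous power-law MLE for the exponent on x >= xmin."""
    return 1.0 + len(xs) / sum(math.log(x / xmin) for x in xs)

def ks_stat(xs, alpha, xmin):
    """KS distance between the empirical CDF and the fitted power-law CDF."""
    xs = sorted(xs)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        model = 1.0 - (x / xmin) ** (1.0 - alpha)
        d = max(d, abs((i + 1) / n - model), abs(i / n - model))
    return d

def bootstrap_pvalue(data, xmin=1.0, n_boot=200, seed=0):
    """Goodness-of-fit p-value for the power-law model: the fraction of
    synthetic power-law samples (refit each time) whose KS distance is at
    least the observed one. p > 0.1 suggests the power law is plausible."""
    rng = random.Random(seed)
    xs = [x for x in data if x >= xmin]
    alpha = fit_alpha(xs, xmin)
    observed = ks_stat(xs, alpha, xmin)
    worse = 0
    for _ in range(n_boot):
        # Draw a synthetic sample from the fitted power law, then refit it
        synth = [xmin * rng.random() ** (-1.0 / (alpha - 1.0)) for _ in xs]
        worse += ks_stat(synth, fit_alpha(synth, xmin), xmin) >= observed
    return worse / n_boot

rng = random.Random(1)
sample = [rng.random() ** (-1.0 / 1.5) for _ in range(500)]
print(f"bootstrap p = {bootstrap_pvalue(sample):.2f}")
```

Refitting alpha on each synthetic sample, rather than reusing the original estimate, is what makes the resulting p-value honest about the fitting procedure's flexibility.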
Back to your specific results: each of your LRT comparisons produces a pair (R, p), where R is the normalized log-likelihood ratio and p is the statistical significance of that ratio. What the p-value tests here is whether the sign of R is meaningful. If p < 0.05 for an LRT, then a positive sign indicates the power-law model is favored. Looking at your results, I see that the exponential and lognormal_positive alternatives are worse fits to the data than the power-law model. However, the lognormal, stretched_exponential, and truncated_power_law alternatives are not, meaning these alternatives are just as terrible fits to the data as your power-law model.
Without the p-value from the hypothesis test for the power-law model itself, the LRT results are not fully interpretable. But even a partial interpretation is not consistent with a strong degree of evidence for a power-law pattern, since two non-power-law models are just as good (bad) as the power law for these data. The fact that the exponential model is genuinely worse than the power law is not surprising considering how right-skewed your data are, so nothing to write home about there.