Index of Unfairness

Objective. Objective scientific knowledge, for many authors more valuable than true subjective belief, is usually gained from research on primary data, although renewed analysis of already recorded or published data is common too. An appropriate experimental or study design is therefore an important, and often seriously underappreciated, determinant of the informativeness and scientific value of any (medical) study. The significance of study design for the reliability of the conclusions drawn, and for the ability to generalize the results from the sample investigated to the whole population, cannot be overestimated. In contrast to an inappropriate statistical evaluation of a medical study, errors in study design are difficult to correct after the study has been completed. Methods. In assessing the significance of a fair design of a medical study, important measures of publication bias are introduced. Methods of analyzing data or publication bias in different types of studies are illustrated by examples with fictive data. Formal mathematical requirements of a fair study design, which can and should be considered carefully in the planning or evaluation of medical research, are developed. Results. Various, especially mathematical, aspects of a fair study design are discussed in this article in detail. Depending on the particular question being asked, mathematical methods are developed which allow us to recognize data which are self-contradictory and to exclude these data from systematic literature reviews and meta-analyses. As a result, different individual studies can be summarized and evaluated with a higher degree of certainty. Conclusions. This article is intended to give the reader guidance in evaluating the design of studies in medical research, even ex post, and should enable the reader to categorize medical studies better and to assess their scientific quality more accurately.


Introduction
Biostatistics, or statistical analysis, is based on the key idea that the observation of a sample of subjects drawn from a certain population can be used to arrive at meaningful conclusions or inferences about that population with a high degree of accuracy. In biomedical research, the various aspects of clinical research and the credibility of the data from a study depend substantially on the study design (Grimes and Schulz, 2002), which is even more important than the analysis of its results. The study design should ensure that a null hypothesis is either rejected or accepted and that the conclusions drawn reflect only the truth. In particular, a poorly analyzed study can be reanalyzed, but a poorly designed study can be rescued only poorly. A badly designed study (inclusion and exclusion criteria and other factors) can compromise the quality of the study sample, with the consequence that the sample is not an appropriate representative of the population. Under such circumstances, other studies may fail to replicate the results of the original study, the inferences drawn may be misleading, and the statistical procedures used cannot help any more. The widespread and well-documented lack of completeness and transparency in the reporting of the statistical methods used endangers the possibility that a new study can reproduce sufficiently similar or the same results as the original study. In point of fact, more than half (52%) of scientists surveyed believe that studies do not successfully reproduce sufficiently similar or the same results as the original studies (Baker, 2016). A careful re-evaluation of the statistical methods and other scientific means which underpin scientific inquiry and research goals appears to be necessary. While it is important to recognize the shortcomings of today's science, one issue which has shaped debates over published studies is the question: has the study measured what it set out to measure? Even if studies carried out can
vary greatly in detail, the data from the studies themselves provide information about the credibility of the data.

Published by IDEAS SPREAD

Material and Methods
Systematic observation and experimentation, inductive and deductive reasoning are essential for any formation and testing of hypotheses and theories about the natural world. In one way or another, logically and mathematically sound scientific methods and concepts are crucial constituents of any scientific progress. When all goes well, different scientists at different times and places using the same scientific methodology should be able to generate the same scientific knowledge.

Definitions
Definition 2.1.1. (The sample space)
Let the sample space S(X) denote a set or collection of all different possible outcomes of an experiment. Each possible single outcome x_t of the experiment is said to be a member of the sample space, or to belong to the space S(X), which is denoted symbolically by the relation x_t ∈ S(X). A set Y is contained in another set X if every element of Y also belongs to X. This relation is expressed symbolically by the expression Y ⊂ X, the set-theoretic expression for saying that Y is a subset of X. A subset of X that contains no elements is called an empty set, or null set, and is denoted by the symbol ∅. In a given experiment, a number p(x_t) is assigned to each event x_t in the sample space S which indicates the probability that x_t will occur. If the event x_t is certain to occur, then the probability of that event is p(x_t) = 1.
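To make the set-theoretic vocabulary concrete, the sample space of one roll of a fair die can be sketched in Python; the die example is a hypothetical illustration, not taken from the article:

```python
import math

# Sample space S of one roll of a fair die; p assigns a probability to each outcome.
S = {1, 2, 3, 4, 5, 6}
p = {x: 1 / 6 for x in S}

Y = {2, 4, 6}   # the even outcomes: Y is a subset of S (Y ⊂ S)
assert Y <= S   # Python's subset relation mirrors Y ⊂ S
assert math.isclose(sum(p.values()), 1.0)  # the probabilities over S sum to 1
print(sorted(Y), p[2])
```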

Definition 2.1.2. (Independence)
Let A_t denote a random variable at a Bernoulli trial (period of time) t, and let B_t denote another random variable at the same Bernoulli trial t. Let p(A_t) denote the probability of A_t, p(B_t) the probability of B_t, and p(A_t ∩ B_t) the joint probability of A_t and B_t. In the case of independence (de Moivre, 1718; Kolmogoroff, 1933) it is

p(A_t ∩ B_t) = p(A_t) × p(B_t).

Definition 2.1.3. (The index of unfairness)
The index of unfairness (IOU) is defined as

IOU = ((A + B) / n) - 1,

where n is the sample size and A and B denote study-design determined counts with 0 ≤ A ≤ n and 0 ≤ B ≤ n. A study design based on A = B = 0 leads to an index of unfairness of IOU = ((0 + 0)/n) - 1 = -1. A study design which demands that A = B = n leads to an index of unfairness of IOU = ((n + n)/n) - 1 = +1. In particular, the range of the index of unfairness is -1 ≤ IOU ≤ +1.

The definition of the sufficient condition relationship (Barukčić, 1989; Barukčić, 1997; Barukčić, 2005; Barukčić, 2006; Barukčić, 2009; Barukčić, 2011; Barukčić, 2012; Barukčić, 2016; Barukčić, 2017; Barukčić, 2018; Barukčić, 2019), the necessary condition and the exclusion relationship can be found in the literature (Barukčić, 2019). The concept of necessary and sufficient conditions (Barukčić, 1989; Barukčić, 1997; Barukčić, 2005; Barukčić, 2006; Barukčić, 2009; Barukčić, 2011; Barukčić, 2012; Barukčić, 2016; Barukčić, 2017; Barukčić, 2018), like other fundamental concepts, is determined by its own parts too, the necessary conditions and the sufficient conditions, which are under some circumstances converses of each other. An event A_t which is a necessary and sufficient condition of another event B_t is more than just a necessary condition of B_t. The same event A_t is equally more than just a sufficient condition, sometimes referred to as material implication, of the same event B_t. Such an event A_t is, at the same Bernoulli trial t, both a sufficient and a necessary condition of the event B_t. The account of necessary and sufficient conditions just outlined is in contrast to the well-known and premature insight of J. L.
Mackie that causes are at least INUS conditions, that is, "the so-called cause is, and is known to be, an insufficient but necessary part of a condition which is itself unnecessary but sufficient for the result" (Mackie, 1965). Setting Mackie's premature generalization, undeniably an oversimplification of the necessary and sufficient condition relationship, slightly aside: how, then, can such a necessary and sufficient condition be mathematized? In this respect, let an event A_t with its own probability p(A_t) at the same (period of) time t be a necessary and sufficient condition for another event B_t with its own probability p(B_t). In other words, without A_t no B_t, or the absence of A_t guarantees the absence of B_t, and in the same respect, if A_t is given then B_t is given too. The mathematical formula of the necessary and sufficient condition relationship of a population is defined as

p(A_t ↔ B_t) = p(A_t ∩ B_t) + p(¬A_t ∩ ¬B_t) = +1. (13)

Definition 2.1.6. (Either A_t or B_t relationship)
Among the many generally valid natural laws and principles under which nature or matter itself assures its own self-organization, a relationship between events denoted as a necessary condition (a conditio sine qua non) (Barukčić, 1989; Barukčić, 1997; Barukčić, 2005; Barukčić, 2006; Barukčić, 2009; Barukčić, 2011; Barukčić, 2012; Barukčić, 2016; Barukčić, 2017; Barukčić, 2018) is one among the most important and is discussed in the literature. A necessary event or condition A_t for some event B_t is a condition that must be satisfied in order to obtain B_t. In this respect, to say that an event A_t with its own probability p(A_t) is at the same (period of) time t a necessary condition for another event B_t with its own probability p(B_t) is equivalent to saying that it is impossible to have B_t without A_t.
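As an aside, the index of unfairness introduced above can be put directly into code. This is a minimal sketch, assuming (as in the extreme cases discussed in the definition) that A and B are the two design-fixed counts with 0 ≤ A ≤ n, 0 ≤ B ≤ n, and that n is the sample size:

```python
def index_of_unfairness(A: int, B: int, n: int) -> float:
    """Index of unfairness: IOU = ((A + B) / n) - 1, ranging from -1 to +1."""
    if n <= 0 or not (0 <= A <= n and 0 <= B <= n):
        raise ValueError("need n > 0 and 0 <= A <= n, 0 <= B <= n")
    return ((A + B) / n) - 1.0

# The two extreme study designs from the text:
print(index_of_unfairness(0, 0, 100))      # A = B = 0  ->  IOU = -1.0
print(index_of_unfairness(100, 100, 100))  # A = B = n  ->  IOU = +1.0
```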
In other words, without A_t no B_t, or the absence of A_t guarantees the absence of B_t. In contrast to this, the mathematical formula of the either A_t or B_t relationship of a population is defined as

p((A_t ∩ ¬B_t) ∪ (¬A_t ∩ B_t)) = +1. (14)

Definition 2.1.7. (The Chi-square goodness-of-fit test)
A Chi-square goodness-of-fit test is one of the commonly used methods of statistical inference and was originally proposed by Karl Pearson (Pearson, 1900). Given some conditions (simple random sampling, a categorical random variable, an expected number of sample observations of at least 5 per category, et cetera), the chi-square goodness-of-fit test can be applied to determine whether observed data (the sample distribution) are consistent with hypothesized data (a theoretical distribution). The degrees of freedom (d.f.) of a chi-square goodness-of-fit test equal the number of levels (k) of the categorical variable minus 1. In general, the chi-square goodness-of-fit statistic is given by

X² = Σ_{i=1..k} (O_i - E_i)² / E_i,

where O_i denotes the observed and E_i the expected frequency of category i. Suppose a coin, assumed to be fair, is tossed 100 times with the results given in Table 3. In this context, the chi-square goodness-of-fit test (Sachs, 1992, p. 421) requires us to state a null hypothesis (H_0) and an alternative hypothesis (H_A). In point of fact, it is p = p(Heads) and q = p(Tails) and (p + q) = 1, or (p(Heads) + p(Tails)) = 1, or p(Tails) = 1 - p(Heads). In our present case (α = 0.05), for a chi-square goodness-of-fit test of this example, the hypotheses take the following form.

Null hypothesis:
The data are consistent with the specified distribution, i.e. p(Heads) = 0.5. The null hypothesis equally claims that p(Heads) = 1 - p(Tails) = 0.5.

Alternative hypothesis:
The data are not consistent with the specified distribution; the null hypothesis is not true.
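These hypotheses can be checked numerically. Table 3 is not reproduced here, so the sketch below assumes a hypothetical split of 60 heads and 40 tails, which is one split that yields the test statistic X² = 4 discussed next; for one degree of freedom, the right-tail p-value can be computed from the complementary error function.

```python
import math

def chi_square_gof(observed, expected):
    """Pearson goodness-of-fit statistic: sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def chi2_p_value_df1(x2):
    """Right-tail p-value of a chi-square statistic with d.f. = 1:
    P(X^2 > x2) = erfc(sqrt(x2 / 2))."""
    return math.erfc(math.sqrt(x2 / 2.0))

observed = [60, 40]   # hypothetical Table 3: 60 heads, 40 tails in 100 tosses
expected = [50, 50]   # H0: p(Heads) = p(Tails) = 0.5
x2 = chi_square_gof(observed, expected)
print(x2, chi2_p_value_df1(x2))  # X^2 = 4.0, p ≈ 0.0455 < 0.05
```

Because the p-value falls below α = 0.05, the hypothetical data would lead to rejecting the null hypothesis of a fair coin.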
The value of the test statistic as calculated before is X² = 4, with d.f. = k - 1 = 2 - 1 = 1. The p-value of X² = 4 is less than the significance level (0.05). We therefore accept the alternative hypothesis and reject the null hypothesis: the sample data do not provide support for the hypothesis that the coin tossed is fair. In general, it is not necessary that p = q in order to use the chi-square goodness-of-fit test, which is the mathematical foundation of the chi-square goodness-of-fit test of a necessary condition, of a sufficient condition, et cetera, with d.f. = k - 1 = 2 - 1 = 1. A random sample of observations can come from a particular distribution (e.g. a sufficient condition distribution) but need not. The X² goodness-of-fit test is an appropriate method for testing the null hypothesis that a random sample of observations comes from a specific distribution (i.e. the distribution of a sufficient condition) against the alternative hypothesis that the data have some other distribution. The additive property of the X² distribution may sometimes be used as an additional test of significance; in this case, the continuity correction should be omitted from each X² value. Under conditions where the chi-square goodness-of-fit test cannot be used, it is possible to use an approximate and conservative (one-sided) confidence interval known as the rule of three. The X² distribution is a particular type of gamma distribution and is widely applied in the field of mathematical statistics. The applicability of the Pearson chi-squared statistic in cases where the cell frequencies of a 2×2 contingency table are not greater than five is widely discussed in the literature (Fisher, 1922), and the use of Yates's continuity correction (Yates, 1934) has been proposed. However, studies have provided evidence that incorporating Yates's continuity correction is not essential (Grizzle, 1967; Conover, 1974). Still, using the continuity correction (Yates, 1934), the chi-square value of a conditio per quam relationship is
derived (Barukčić, 2018). Under conditions where the chi-square goodness-of-fit test cannot be used, an approximate and conservative (one-sided) confidence interval known as the rule of three may be of use. Using the continuity correction, the chi-square value of a conditio sine qua non distribution (Barukčić, 1989; Barukčić, 1997; Barukčić, 2005; Barukčić, 2006; Barukčić, 2009; Barukčić, 2011; Barukčić, 2012; Barukčić, 2016; Barukčić, 2017; Barukčić, 2018) changes accordingly; depending upon the study design, another method to calculate the chi-square value of a conditio sine qua non distribution (while using the continuity correction) is defined analogously. The chi-square value, with degrees of freedom 2 - 1 = 1, of the exclusion relationship (Barukčić, 1989; Barukčić, 1997; Barukčić, 2005; Barukčić, 2006; Barukčić, 2009; Barukčić, 2011; Barukčić, 2012; Barukčić, 2016; Barukčić, 2017; Barukčić, 2018) can likewise be calculated with a continuity correction, and depending upon the study design, another method to calculate the chi-square value of the exclusion relationship is defined analogously. The chi-square goodness-of-fit test of the exclusion relationship examines how well observed data compare with the expected theoretical distribution of an exclusion relationship.

Definition 2.1.11. (The logarithmic correction of the Chi-square goodness-of-fit test)
The logarithmic correction of the chi-square goodness-of-fit test is defined with degrees of freedom d.f.
= k - 1, which is of use for Big Data applications.

Definition 2.1.12. (The Mathematical Formula of the Causal Relationship k)
The mathematical formula of the causal relationship k (Barukčić, 1989; Barukčić, 1997; Barukčić, 2005; Barukčić, 2006; Barukčić, 2009; Barukčić, 2011; Barukčić, 2012; Barukčić, 2016; Barukčić, 2017; Barukčić, 2018) is defined at every single event, at every single Bernoulli trial t, as

k(A_t, B_t) = (p(A_t ∩ B_t) - p(A_t) × p(B_t)) / √(p(A_t) × (1 - p(A_t)) × p(B_t) × (1 - p(B_t))),

where A_t denotes the cause and B_t denotes the effect. Under certain circumstances, the chi-square distribution can be applied to determine the significance of the causal relationship k. Pearson's concept of correlation is not identical with causation; causation as such is not identical with correlation. This has been proved many times and is widely discussed in many publications.
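The causal relationship k can be evaluated from a 2×2 contingency table. The sketch below is an assumption on my part: it uses the phi-coefficient form k = (a·d - b·c) / √((a+b)(c+d)(a+c)(b+d)), which is the 2×2-table equivalent of a probability-based definition k = (p(A∩B) - p(A)p(B)) / √(p(A)(1-p(A))p(B)(1-p(B))); the article's own display equation is not reproduced here.

```python
import math

def causal_relationship_k(a, b, c, d):
    """k from a 2x2 table (assumed phi-coefficient form):
    a = A and B, b = A without B, c = B without A, d = neither A nor B."""
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denom

print(causal_relationship_k(40, 10, 10, 40))  # strong positive association: 0.6
print(causal_relationship_k(25, 25, 25, 25))  # no association: 0.0
```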
Definition 2.1.13. (The Random Variables and Distributions)
Let X denote a real-valued function defined on a sample space, a random variable, with a finite number of finite outcomes x_1 occurring with probability p(X = x_1), x_2 occurring with probability p(X = x_2), ..., x_n occurring with probability p(X = x_n). The collection of all of these probabilities denotes the distribution of the discrete or continuous random variable X. A discrete distribution is characterized by its probability mass function (p.m.f.); a continuous distribution is characterized by its probability density function (p.d.f.). Let E(x_t) denote the expectation value of a single event x_t, and let E(x_t²) denote the second moment expectation value of a single event x_t.
In general, it is

E(x_t) = x_t × ψ(x_t) × ψ*(x_t), (26)

where ψ(x_t) denotes the wave function of the random variable x_t and ψ*(x_t) denotes the complex conjugate of the wave function. Let σ(x_t)² denote the variance of a single event. Let E(X) denote the expectation value of the random variable X, and let Ψ(X) denote the wave function of the random variable X, with Ψ*(X) its complex conjugate. Under conditions where X × Ψ*(X) = 1 (Barukčić, 2016) a simplification is possible, but not in general. Let E(X²) denote the expectation value of the second moment of a random variable X, and let σ(X)² denote the variance of the random variable X; then

σ(X)² = E(X²) - E(X)².

Historically, the binomial probability mass function of observing exactly x successes in n trials, with the probability of success on a single trial denoted by p and q = 1 - p, is defined as

p(X = x) = (n over x) × p^x × q^(n - x)

and was derived by the prominent Swiss mathematician Jacob Bernoulli (1655-1705) in his work Ars Conjectandi (Bernoulli, 1713). The mathematical formula for probabilities in the binomial distribution may be very simple, but the calculation itself can be pretty troublesome. A binomial distribution with parameters p and n = 1 is called the Bernoulli distribution with parameter p, while x can take the value either +0 or +1. Under conditions where X = n, the binomial distribution simplifies to p(X = n) = p^n. Under conditions where X = 0, the binomial distribution simplifies to p(X = 0) = q^n. Under certain circumstances, the Poisson distribution is a useful approximation to the binomial distribution with a very small success probability, especially when the value of n is large and the value of p is close to 0.
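A short sketch of the binomial mass function and the two boundary cases just mentioned (X = n and X = 0), using illustrative parameters:

```python
import math

def binomial_pmf(x, n, p):
    """Bernoulli's binomial mass function: C(n, x) * p^x * (1 - p)^(n - x)."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 10, 0.3
assert binomial_pmf(n, n, p) == p ** n        # X = n  ->  p^n
assert binomial_pmf(0, n, p) == (1 - p) ** n  # X = 0  ->  q^n
total = sum(binomial_pmf(x, n, p) for x in range(n + 1))
print(total)  # the pmf sums to 1 (up to floating-point rounding)
```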

Definition 2.1.15. (The Poisson distribution)
The Poisson distribution, given previously by Abraham de Moivre (Moivre, 1733), is ascribed to Siméon Denis Poisson (1781-1840), a French mathematician, physicist, and engineer who published the same distribution in 1837 in his work "Recherches sur la probabilité des jugements en matière criminelle et en matière civile" (Poisson, 1837). Ladislaus Bortkiewicz (Bortkiewicz, 1898) provided in 1898 one of the first practical applications of Poisson's distribution while investigating the number of soldiers in the Prussian army killed accidentally by horse kicks. A discrete random variable X is said to have a Poisson distribution with parameter λ > 0 if, for x = 0, 1, 2, ..., the probability mass function of X is given by

p(X = x) = (λ^x × e^(-λ)) / x!,

where x is the number of times an event occurs in an interval and can take the values 0, 1, 2, ...; e is Euler's number (2.71828..., the base of the natural logarithms); x! is the factorial of x, x! = x × (x-1) × (x-2) × ... × 2 × 1; and λ = n × p is the mean of the Poisson distribution. Many times, the Poisson distribution is applied to experimental conditions or situations with a large number of trials n while the occurrence of each single event is very rare. Under conditions where x = 0, it is p(X = 0) = e^(-λ). Suppose that the probability of a conditio sine qua non relationship is p(A_t ← B_t) ≈ +1. The probability that the conditio sine qua non relationship will not be given is then p(¬(A_t ← B_t)) = 1 - p(A_t ← B_t) ≈ +0. The expectation value of the number of cases without a conditio sine qua non relationship is λ = n × (1 - p(A_t ← B_t)). The probability of observing none of these rare events follows as p(X = 0) = e^(-λ), and the probability of counting at least one rare Poisson event is 1 minus the probability of counting none, which is 1 - e^(-λ).

Definition 2.1.16. (The Chebyshev inequality)
Let X be a random variable with finite expected value E(X) and finite non-zero variance σ(X)². Then for any real number x > 0, the probability calculated according to Chebyshev's inequality (Bienaymé, 1846; Tchébychef, 1867)
follows as

p(|X - E(X)| ≥ x) ≤ σ(X)² / x².

Chebyshev's inequality (also called the Bienaymé-Chebyshev inequality) provides only very approximate values.
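The rare-event calculation of Definition 2.1.15 can be sketched in code, with λ treated as the expected number of missing conditio sine qua non cases; the value λ = 0.5 is an arbitrary illustration:

```python
import math

def poisson_pmf(x, lam):
    """Poisson mass function: lam^x * e^(-lam) / x!."""
    return lam ** x * math.exp(-lam) / math.factorial(x)

lam = 0.5                       # expected number of rare events, lam = n * (1 - p)
p_none = poisson_pmf(0, lam)    # probability of observing none: e^(-lam)
p_at_least_one = 1.0 - p_none   # probability of at least one rare event
assert math.isclose(p_none, math.exp(-lam))
print(p_none, p_at_least_one)
```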

Results
Theorem 3.1. (Distributions and anti-distributions)
Suppose that S defines the sample space of an experiment completely. Let a real-valued function (a random variable) X defined on the sample space S assign a real number X(s) to each possible outcome s ∈ S in a particular experiment. The distribution of the random variable X is defined as the collection of all probabilities p(X ∈ A) for all subsets A of the real numbers. A discrete random variable is defined as a random variable X which can take only a finite number k of different values x_1, ..., x_k or, at most, an infinite sequence x_1, x_2, ... The distribution of a discrete random variable X is defined by its probability mass function, abbreviated p(x) or p.m.f.(x), namely p(x) = p.m.f.(x) = p(X = x) for all x in the set of possible values. A random variable X which can take every value in an interval is called a continuous random variable. A continuous distribution is defined by its own probability density function (p.d.f.) f(x) of the distribution of X for every interval (a, b) as

p(a < X < b) = ∫_a^b f(x) dx.

Continuous random variables satisfy the condition p(X = x) = 0. In practical problems it may sometimes be necessary to consider a distribution as a mixture of a continuous distribution and a discrete distribution. Again, the cumulative distribution function, abbreviated P(x) or F(x) or d.f.(x) or c.d.f.(x), of every random variable X, regardless of whether the distribution of X is continuous, discrete or mixed, is defined for each real number x as

F(x) = p(X ≤ x), for -∞ < x < +∞.

Claim.
For every value x, the anti-distribution of x, denoted p(X ≠ x), is determined as

p(X ≠ x) = 1 - p(X = x).

Proof.
For every value x, regardless of whether the distribution of X is continuous, discrete or mixed, it is

p(X < x) + p(X = x) + p(X > x) = 1.

Since p(X ≤ x) = p(X = x) + p(X < x), the equation before can be rearranged as

p(X ≤ x) + p(X > x) = 1, for -∞ < x < +∞.

Rearranging again, we obtain

p(X < x) + p(X > x) = 1 - p(X = x), for -∞ < x < +∞.

We define the anti-distribution of x, the distribution of every value of anti-x, as

p(X ≠ x) ≡ p(X < x) + p(X > x) ≡ 1 - p(X = x).

The anti-binomial distribution can be derived as

p(X ≠ x) = 1 - ((n over x) × p^x × q^(n - x)).

The probability density of an anti-normal (or anti-Gaussian or anti-Gauss or anti-Laplace-Gauss) distribution follows as

f̄(x) = 1 - (1/(σ × √(2 × π))) × e^(-((x - µ)² / (2 × σ²))),

where µ denotes the mean or expectation of the distribution and σ² is the variance. The probability density of an anti-Poisson distribution is given by

p(X ≠ x) = 1 - ((λ^x × e^(-λ)) / x!),

where x is the number of times a very rare event was observed.
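The anti-distributions reduce to complements of the usual mass functions; a brief sketch with illustrative parameters:

```python
import math

def anti_binomial(x, n, p):
    """Anti-distribution p(X != x) = 1 - C(n, x) * p^x * (1 - p)^(n - x) for a binomial X."""
    return 1.0 - math.comb(n, x) * p ** x * (1 - p) ** (n - x)

def anti_poisson(x, lam):
    """Anti-distribution p(X != x) = 1 - lam^x * e^(-lam) / x! for a Poisson X."""
    return 1.0 - lam ** x * math.exp(-lam) / math.factorial(x)

# p(X = x) and p(X != x) are complements:
pmf = math.comb(10, 3) * 0.5 ** 3 * 0.5 ** 7
assert math.isclose(anti_binomial(3, 10, 0.5) + pmf, 1.0)
print(anti_binomial(3, 10, 0.5), anti_poisson(0, 2.0))
```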

Theorem 3.2. (The cumulative distribution)
The cumulative distribution function, abbreviated P(x) or F(x) or d.f.(x) or c.d.f.(x), of every random variable X, regardless of whether the distribution of X is continuous, discrete or mixed, is defined for each real number x as

F(x) = p(X ≤ x), for -∞ < x < +∞.

Claim.
For every value x, it is

F(x) = 1 - p(X > x).

Proof.
For every value x, it is

p(X ≤ x) + p(X > x) = 1,

and the theorem follows directly from the definition of the cumulative distribution function as

F(x) = p(X ≤ x) = 1 - p(X > x).

Quod erat demonstrandum.

Remark 3. Many times, a key problem is that the exact value of a population proportion p_0 (pronounced "p-naught") is not known while p, the sample proportion, is known. In point of fact, we might not even know the total number of subjects comprising a certain population. Under conditions when a population proportion is unknown, we must formulate competing and mutually exclusive hypotheses about such a proportion, collect data representative of the population, evaluate these data, and determine which hypothesis is supported. The (left-tailed) null (H_0) and alternative (H_A) hypotheses are under these circumstances as follows:

H_0: p ≥ p_0
H_A: p < p_0

A left-tailed p-value which is greater than or equal to α (p-value ≥ α) provides some evidence to accept the null hypothesis, while a p-value which is less than α (p-value < α) supports the decision to reject the null hypothesis. In other words, under the condition of the validity of the null hypothesis, the left-tailed p-value can be calculated using the formula

p-value = p((X ≤ x) | H_0).

A null hypothesis formulated before the performance of a scientific study should be either accepted or rejected. P-values are one of the useful statistical measures which enable us, to some extent, to compare the statistical plausibility and clinical relevance of the conclusions drawn about a study finding with respect to a random event.
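For a binomial proportion, the left-tailed p-value above is an exact tail sum; the sketch below uses hypothetical numbers (n, x, p_0 are not taken from the article):

```python
import math

def binomial_left_p(x, n, p0):
    """Exact left-tail p-value P(X <= x | H0: proportion = p0) under a binomial model."""
    return sum(math.comb(n, k) * p0 ** k * (1 - p0) ** (n - k) for k in range(x + 1))

n, x, p0, alpha = 50, 18, 0.5, 0.05   # hypothetical data
p_value = binomial_left_p(x, n, p0)
print(p_value, "reject H0" if p_value < alpha else "do not reject H0")
```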
Historically, the question of the p-value was addressed especially by John Arbuthnott (Arbuthnott, 1710) in 1710 and later by Pierre-Simon Laplace at the beginning of Chapter V of his book "Théorie analytique des probabilités" (Laplace, 1812). Formally, it was Karl Pearson who introduced the p-value (Pearson, 1900) as capital P. In point of fact, Fisher himself proposed in his influential book "Statistical Methods for Research Workers" (Fisher, 1925) the level p-value = 0.05 as a limit for statistical significance (Schervish, 1994).

Theorem 3.3. (The p-value for a right tail (upper) event)
The p-value for a right tail (upper) event is given by

p-value = p((X ≥ x) | H_0).

Proof.
In general, it is p((X ≥ x) | H_0) = p((X = x) | H_0) + p((X > x) | H_0). The p-value for a right tail (upper) event is therefore given by

p-value = p((X = x) | H_0) + p((X > x) | H_0).

Quod erat demonstrandum.

Remark 4.
The (right-tailed) null (H_0) and alternative (H_A) hypotheses are under these circumstances as follows:

H_0: p ≤ p_0
H_A: p > p_0

A p-value which is less than α (p-value < α) supports the decision to reject the right-tailed null hypothesis, while a right-tailed p-value which is greater than or equal to α (p-value ≥ α) provides some evidence to accept the right-tailed null hypothesis. Let p_0(A_t ← B_t) denote the population proportion of the conditio sine qua non relationship.
A random sample of size n is drawn from the population. The absolute frequency of the conditio sine qua non relationship within the sample drawn was observed as X(A_t ← B_t) = n × p(A_t ← B_t), while p(A_t ← B_t) = X(A_t ← B_t) / n denotes the relative frequency of the conditio sine qua non relationship within the sample. What is the probability that there will be X(A_t ← B_t) or more cases of the conditio sine qua non relationship within the population? Obviously, the normalized one-sided right-tailed variable z (Sachs, 1992) becomes

z = (p(A_t ← B_t) - p_0(A_t ← B_t) - (1/(2 × n))) / √((p_0(A_t ← B_t) × (1 - p_0(A_t ← B_t))) / n).

The continuity correction 1/(2 × n) becomes smaller as n becomes larger.
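A sketch of this normalized z statistic, assuming the standard one-sided form with continuity correction 1/(2·n) and p-values from the normal distribution; the counts used below are hypothetical:

```python
import math

def z_right_tail(x, n, p0):
    """One-sided z for a sample proportion x/n against H0: proportion = p0,
    with continuity correction 1/(2n) (assumed standard form, cf. Sachs, 1992)."""
    p_hat = x / n
    return (p_hat - p0 - 1.0 / (2 * n)) / math.sqrt(p0 * (1 - p0) / n)

def normal_right_p(z):
    """Right-tail standard normal probability P(Z > z) = erfc(z / sqrt(2)) / 2."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

z = z_right_tail(165, 172, 0.9)   # hypothetical: 165 of 172 cases, H0: p0 = 0.9
print(z, normal_right_p(z))
```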
Theorem 3.4. (The p-value for a right tail (upper) event under conditions where X = x = n)
The p-value for a right tail (upper) event under conditions where X = x = n is given by

p-value = p((X = n) | H_0).

Proof.
In general, it is p((X ≥ x) | H_0) = p((X = x) | H_0) + p((X > x) | H_0). Under conditions where X = x = n, we obtain p((X ≥ n) | H_0) = p((X = n) | H_0) + p((X > n) | H_0). Mathematically, it is not possible that X > n; thus far, p((X > n) | H_0) = 0. Under these assumptions, the equation before simplifies to

p((X ≥ n) | H_0) = p((X = n) | H_0).

Equally, it is p((X > x) | H_0) = 1 - p((X ≤ x) | H_0). The p-value for a right tail (upper) event under conditions where X = x = n is therefore given by p((X = n) | H_0). Quod erat demonstrandum.
The results one expects to obtain if some underlying assumption is true and the results observed while using some experimental data can differ by chance or systematically.

Theorem 3.5. (p value according to Poisson Distribution)
A binomial distribution is a sum of n independent Bernoulli random variables with probability π. For very high or very low π, a binomial distribution is a very skewed distribution. Under conditions with very low probability π and very large n, the Poisson distribution may be used as an approximation to the binomial distribution. In practice it is possible not to observe a conditio sine qua non relationship within a sample even if, within the population, such a relationship is given. Events like these can be accepted only under very limited circumstances and should be extremely rare, with the consequence that the law of rare events, or Poisson limit theorem, can be used to test significance.

Claim.
The left-tailed p-value of a Poisson distributed random variable (where x = 0) is given by

p-value = e^(-λ).

Proof.
In general, it is

p(X = 0) + p(X > 0) = 1.

Mathematically, the left-tailed p-value (for x = 0) is defined as p(X ≤ 0) = 1 - p(X > 0). Rearranging the equation before, we obtain p(X ≤ 0) = p(X = 0). The Poisson distribution is given by

p(X = x) = (λ^x × e^(-λ)) / x!.

Under conditions where x = 0 we obtain p(X = 0) = e^(-λ), and the left-tailed p-value under these conditions (λ = n × p) is given by

p-value = e^(-(n × p)).

A left-tailed p-value which is greater than or equal to α (p-value ≥ α) provides some evidence to accept the null hypothesis, while a p-value which is less than α (p-value < α) supports the decision to reject the null hypothesis.
The Poisson distribution is given by

p(X = x) = (λ^x × e^(-λ)) / x!,

where x is the number of times an event occurs in an interval and can take the values 0, 1, 2, ...; e is Euler's number (2.71828..., the base of the natural logarithms); x! is the factorial of x, x! = x × (x-1) × (x-2) × ... × 2 × 1; and λ = n × p is a positive real number, the mean, equal to the expected number of rare occurrences of an event (i.e. no conditio sine qua non relationship observed). Under the condition of the validity of the null hypothesis, the left-tailed Poisson p-value can be calculated using the formula above. The sample size is again n = 172. The relative frequency of the conditio sine qua non relationship is E(H(X_n)) = 169/172 = 0.98255814 and p = 0.05. In other words, the conditio sine qua non relationship was not observed at all in 3 out of 172 cases. According to our left-tailed hypothesis, we are of the opinion that this probability is greater than or equal to 0.05. With λ = n × p = 172 × 0.05 = 8.6, the left-tailed p-value can be calculated as the Poisson probability of observing at most 3 such cases. In other words, the probability that a conditio sine qua non relationship within the population will not be observed is less than 0.05.
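The calculation for this example in code: with λ = 172 × 0.05 = 8.6 and 3 observed failures, the left-tailed p-value is the Poisson probability of at most 3 events.

```python
import math

def poisson_left_p(x, lam):
    """Left-tail Poisson p-value: P(X <= x) = sum of lam^k * e^(-lam) / k! for k = 0..x."""
    return sum(lam ** k * math.exp(-lam) / math.factorial(k) for k in range(x + 1))

lam = 172 * 0.05            # expected number of failures under H0: lam = 8.6
p_value = poisson_left_p(3, lam)
print(p_value)              # ≈ 0.028, below alpha = 0.05: reject H0
```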
Theorem 3.6. (The exact probability of a single event)
Let X denote a random variable with distribution, probability mass function and probability density function as in Definition 2.1.13, and let E(x_t) denote the expectation value and E(x_t²) the second moment expectation value of a single event x_t.
In general, the probability that an event will occur is expressed as a number between +0 and +1 and can be defined in many different ways. For our purposes, the probability of an event which has the value or quantity x_t is represented by p(x_t), and we define the probability that a single event has the value x_t at the Bernoulli trial t by the relationship

p(x_t) ≡ E(x_t) / x_t ≡ ψ(x_t) × ψ*(x_t),

where E(x_t) denotes the expectation value of a single event, ψ(x_t) denotes the wave function of the random variable x_t and ψ*(x_t) denotes the complex conjugate of the wave function. Such a definition of probability assumes that every single event is associated with its own expectation value, even under circumstances where p(x_t) = 1; under these conditions it is equally E(x_t) = x_t. The definitions above are independent of the distribution of x_t. The variance of a single event x_t, denoted σ(x_t)², is likewise independent of the distribution of x_t and defined as

σ(x_t)² ≡ p(x_t) × (1 - p(x_t)) × x_t².

Claim.
In general, the distribution-free exact probability of a single event is given by

p(x_t) ≡ E(x_t) / x_t.

Under conditions of a binomial distribution, it is p = E(X) / n.

Remark 7.
According to Chebyshev's inequality, we obtain an approximation, while the number E(X) is also called the mean of X or the expected value of X. The terms mean, expected value and expectation value are used interchangeably.
Theorem 3.7. (The approximate probability p of a single event)

Claim.
In general, the probability p of a single event is given approximately by

p ≈ X / n

as the number of trials n goes to positive infinity (n → +∞), while X = n × p denotes the number of successes occurring anywhere among the n trials.

Proof.
In general, it is +1 = +1 (lex identitatis (von Leibniz, 1765; Barukčić, 2016)). Multiplying by p, we obtain 1 × p = 1 × p, where p denotes the probability of a single event. Let n denote something like the number of trials or the sample size et cetera. Performing the power operation, it is pⁿ = pⁿ. According to the mathematical requirements it is q ≡ 1 - p and λ ≡ n × q ≡ n × (1 - p).
Rearranging the equation before, it is

Taking the limit as the number of trials n goes to positive infinity (n → +∞), we obtain

Are such observations appropriate at all to justify predictions about observations we have not yet made, or about a reality we are still not aware of, or maybe even with regard to general claims which go far beyond the observed?
The question is, of course: are we allowed to infer a hypothesis about the general situation based on the observation of a limited sample? In other words, how (long) can we be uncertain about the unknown, the infinitely empty, the unobserved, on what ground and to what extent? One may object that any analysis of the notions of cause and effect is confronted by the unobserved and the not completely known too. On this view, it is not the main goal of this paper to solve the famous philosophical problem of induction and inductive inference as introduced by David Hume in Book 1, part iii, section 6 of his 1739 book "A Treatise of Human Nature" (Hume, 1739). However, in order to approach the solution of this problem, it is necessary to point out that under certain circumstances logic, mathematics and statistics are able to provide us, to some extent, with methods of direct inference even about the unknown.

Theorem 3.8. (The distribution of likely events)
In general, under conditions where X = x = n, it is

where the probability of a single event is given by p and n is the sample size, as the number of trials n goes to positive infinity (n → +∞).

Proof.
Historically, the binomial probability mass function of observing exactly x successes in n trials, with the probability of success on a single trial denoted by p and q = 1-p or p + q = 1, was derived by the prominent Swiss mathematician Jacob Bernoulli (1655-1705) in his work Ars Conjectandi (Bernoulli, 1713) as

The mathematical formula to find probabilities in the binomial distribution may be very simple, but the calculation itself can be pretty troublesome. A binomial distribution with parameters p and n = 1 is called the Bernoulli distribution with parameter p, while x can take the values either +0 or +1 and it is

where p = 1-q. Under conditions where X = x = n, it follows that p(X = n) = p^n, or that p(X = n) = (1 - q)^n. Defining λ = n×q = n×(1-p), we obtain p(X = n) = (1 - (λ/n))^n. Taking the limit as the number of trials n goes to positive infinity (n → +∞), we obtain, according to elementary calculus (DeGroot et al., 2005), p(X = n) → e^(-λ). In this context, the probability to obtain x = n successes, drawn with replacement from a population in a sequence of n independent Bernoulli trials (experiments) with q = 1 - p and λ = n×(1-p), as the number of trials n goes to positive infinity (n → +∞), is given by e^(-λ), where the probability of a single event is given by p. Quod erat demonstrandum.

Remark 9.
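The limit sketched above can be checked numerically. The following is a minimal sketch (the function name is ours, chosen for illustration) which evaluates p(X = n) = p^n for a fixed λ = n×(1-p) and compares it with e^(-λ) as n grows:

```python
import math

def binom_all_successes(n, p):
    """p(X = n): the probability of observing x = n successes in n
    independent Bernoulli(p) trials, which is simply p**n."""
    return p ** n

# With q = 1 - p and lam = n*q held fixed, p**n = (1 - lam/n)**n
# approaches exp(-lam) as the number of trials n goes to infinity.
lam = 2.0
for n in (10, 100, 10_000, 1_000_000):
    p = 1.0 - lam / n
    print(n, binom_all_successes(n, p), math.exp(-lam))
```

For n = 1,000,000 the two values agree to better than five decimal places, illustrating the convergence claimed in the proof.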
The binomial distribution can be approximated (de Moivre, 1733) by the normal distribution too. The accuracy of such an approximation depends on several factors and first requires some pre-calculations. As a rule of thumb, using the normal distribution to approximate binomial probabilities is good if both (n×p) > 5-10 and (n×(1-p)) > 5-10. Especially under conditions where the number of successes x is equal to the number of trials n, or goes to n, such an approximation may provide inaccurate probabilities. As proved in this publication, an approximate p-value for a right-tail (upper) event, under conditions where our expectation is that X = x = n, is given by

Theorem 3.9. (The Chi-square goodness-of-fit test of a necessary condition)

Unfortunately, there is always the possibility that the results of a study may be wrong, and sometimes a difference observed during an investigation is just the result of random subjective or objective errors or random effects. A statistical test is more or less about managing such and similar risks by the tools of probability theory, not about certainty. In point of fact, a true null hypothesis (there is no difference) should be accepted.

Claim.
In general, a mathematical formula of the Chi-square goodness-of-fit test of a necessary condition can be derived as

Proof.
The conditio sine qua non relationship of a population is defined (Table 1) as

or as

To see how this applies to the theorem above, let us simplify the equation before as

or as

In general, it is p(B̄_t) = 1 - p(B_t). Ultimately, for this reason, a conditio sine qua non relationship simplifies, under the press of mathematics, as

Multiplying by n, we obtain

which is equivalent with

Rearranging, it is

Dividing by B, it is

or

More precisely, with the complementary event considered, it is

or

It is B = a + c. More broadly, the equation reduces to

Finally, the clearness, beauty and simplification provided by the equation before yields the Chi-square goodness-of-fit test of a necessary condition, without using the continuity correction, as

Quod erat demonstrandum.

Remark 10. Depending upon personal taste, another method to calculate the chi-square value of a conditio sine qua non relationship, with the continuity correction as demonstrated before, can be derived as

The discussion whether the use of the continuity correction is necessary at all is still not closed. In a similar way, the chi-square values of other relationships defined in this paper can be derived. Objections to the necessary condition can be made from a number of backgrounds. For example, without oxygen and water there would be no human life on this planet. Hence, oxygen and water are necessary conditions for the existence of human beings.
In other words, without oxygen (gaseous, for a certain period of time et cetera) there is no existence of human beings on this planet. In the same respect, without water (for a certain period of time et cetera) there is no existence of human beings on this planet either. Thus far, even if water is given, without oxygen human beings on this planet will not survive.
Central to the goal of specifying, at least in part, a necessary condition is the fact that the same does not depend on the existence of other necessary conditions. Even under conditions where several necessary conditions must be given for an event or a random variable to occur, if one single necessary condition is not met, the event cannot occur. On this view, it is not surprising that the notion of necessary condition is used, too, to define an event as the cause of another event. If an event A_t is a necessary condition of another event B_t, and if both events are equally causally related, then the event A_t is the cause of B_t. Moreover, in point of fact it seems doubtful whether the event A_t can be regarded as the only cause of B_t. Broadly speaking, because of these objections, and even if it seems grossly oversimplified, a straightforward way to give a precise and comprehensive account of the notion of the only cause of an event is a necessary and sufficient condition relationship (Hassani et al., 2018; Barukčić, 2018).
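The goodness-of-fit test derived above can be sketched in code. As a hedged illustration only: the sketch below assumes the statistic takes the form χ² = c²/B + c²/(n - B), a reconstruction consistent with the derivation (under the null "without A no B" the expected count of B-cases is B and every deviation equals c); this particular form is an assumption for illustration, not necessarily the article's exact formula.

```python
def chi2_necessary(a, b, c, d):
    """Assumed Chi-square goodness-of-fit statistic for the
    necessary-condition null hypothesis 'without A no B'.
    Cell counts of the 2x2 table: a = (A and B), b = (A and not-B),
    c = (not-A and B), d = (not-A and not-B).
    Reconstructed form (assumption): chi2 = c**2/B + c**2/(n - B),
    which is zero exactly when c = 0."""
    n = a + b + c + d
    B = a + c                      # number of outcome-positive subjects
    return c ** 2 / B + c ** 2 / (n - B)

# Fictive data: no subject shows the outcome without the condition (c = 0)
print(chi2_necessary(40, 30, 0, 30))   # -> 0.0
```

Data with c > 0 yield a positive statistic, which can then be compared with the Chi-square distribution with one degree of freedom.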
Theorem 3.10. (Self-contradictory data I)

Let p(A_t) denote the probability of the condition (i.e. risk factor), let p(B_t) denote the probability of the conditioned (i.e. the outcome), and let p(A_t and B_t) denote the joint probability that A_t and B_t will occur/have occurred. Under conditions where the relationship between two random events, abbreviated as A_t and B_t, is determined by a necessary condition, it is p(A_t ← B_t) = p(A_t and B_t) + (1 - p(B_t)) = 1 and equally p(A_t and B_t) = p(B_t).

Claim.
In general, under circumstances where p(A_t) < 1 and p(A_t and B_t) = p(B_t), it is

Under conditions where p(A_t) = 1, it follows that k(A_t and B_t) = 0 and, mathematically, A_t and B_t have to be treated as being independent of each other. In many problems, data gained from some observations provide an opportunity to increase the degree of confidence when a decision is made to either accept the null hypothesis or accept the alternative hypothesis. Clearly, the null hypothesis and the alternative hypothesis are mutually exclusive, so that either the null hypothesis is false and the alternative hypothesis is true, or the null hypothesis is true and the alternative hypothesis is false. In other words, a study design which provides data supporting the null hypothesis "without A_t no B_t" cannot at the same time support the hypothesis that k < 0. Such data are self-contradictory and cannot be used for further analysis.

Theorem 3.11. (Self-contradictory data II)

Claim.
In general, an exclusion relationship demands that

In other words, a study design which provided data with significant evidence that A_t excludes B_t, and vice versa, should equally yield a causal relationship k(A_t, B_t) < 0; otherwise the data are potentially biased and should be treated as self-contradictory.
Theorem 3.12. (Self-contradictory data III)

Let p(A_t) denote the probability of the condition (i.e. risk factor), let p(B_t) denote the probability of the conditioned (i.e. the outcome), and let p(A_t and B_t) denote the joint probability of A_t and B_t. Under conditions where the relationship between A_t and B_t is determined by a sufficient condition, it is p(A_t → B_t) = p(A_t and B_t) + (1 - p(A_t)) = 1 and equally p(A_t and B_t) = p(A_t). In general, under circumstances where p(B_t) < 1, it is

Under conditions where p(B_t) = 1, it follows that k(A_t and B_t) = 0 and A_t and B_t must be treated as being independent of each other. In particular, data which provide evidence that A_t is a sufficient condition of B_t must not, in the same respect, provide evidence that there is a significant cause-effect relationship too. In fact, our ability to recognize conditions or risk factors might be seriously endangered by treating a cause as being identical with a condition. A cause is a condition too, but not vice versa. A condition need not be a cause. Therefore, and due to mathematical requirements, a significant cause-effect relationship is not necessary to establish a significant sufficient-condition relationship. The analysis of alleged examples can show, among other things, how sufficient conditions should be understood, especially in relation to causation. To get clear on this, we can consider the following conditio per quam hypothesis: "If an elephant has four legs, then a sun is made of gas." An investigation is performed and it is found that elephants have more or less four legs, while the sun (our sun) is made of gas. Rare events can lead to the fact that there are elephants with less than four legs. Still, a senseless and erroneous conclusion could be that elephants with four legs are a conditio per quam of a sun made out of gas. However, whether such a hypothesis makes any sense at all, and the investigation performed, must be analyzed very precisely. Circumstances where there are elephants with four legs but the sun considered is not made out of gas must be considered too. Furthermore, conditions where there are elephants with less than four legs and the sun considered is not made out of gas must be considered as well, to avoid logical fallacies.
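The consistency checks of Theorems 3.10-3.12 can be operationalized. The sketch below is hedged: the article's own formula for the causal relationship k is not reproduced here, so the code assumes a phi-like coefficient k = (n×a - A×B)/√(A×B×(n-A)×(n-B)); both the assumed formula and the function names are ours, for illustration only.

```python
import math

def k_coefficient(a, b, c, d):
    """Assumed form of the causal relationship k (a phi-like coefficient):
    k = (n*a - A*B) / sqrt(A * B * (n-A) * (n-B)).
    This exact form is an assumption for illustration."""
    n = a + b + c + d
    A, B = a + b, a + c
    return (n * a - A * B) / math.sqrt(A * B * (n - A) * (n - B))

def self_contradictory(a, b, c, d):
    """Flag data that support 'without A no B' (c == 0) while k < 0,
    or support 'A excludes B' (a == 0) while k is not negative."""
    k = k_coefficient(a, b, c, d)
    if c == 0 and k < 0:      # necessary condition must not give k < 0
        return True
    if a == 0 and k >= 0:     # exclusion must give k < 0
        return True
    return False

print(self_contradictory(40, 30, 0, 30))   # necessary condition, k > 0
print(self_contradictory(0, 50, 30, 20))   # exclusion, k < 0
```

Both fictive tables are internally consistent, so neither is flagged; a table with c = 0 but a negative k would be excluded from further analysis.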
Theorem 3.13. (A fair study design I)

A study design which demands or assures that A < (n - B), or that A > (n - B), can yield different calculated Chi-square values, and the question arises: which X² is the correct one, the one that may be relied on?

Claim.
A study design is fair from the standpoint of a conditio per quam relationship, and the data are formally not self-contradictory due to study design, if

Proof.
The Chi-square value of a conditio per quam relationship demands that

Both methods applied to one body of data should yield the same Chi-square value. In other words, it is

For preliminary reasons, define (-b)² ≡ 1 and rearrange the equation; it is

Quod erat demonstrandum.
Theorem 3.14. (A fair study design II)

Claim.
A study design is fair from the standpoint of a conditio sine qua non relationship, and the data are formally not self-contradictory due to study design, if

Proof.

The Chi-square value of a conditio sine qua non relationship demands that

The study design should assure that, if both methods are applied to the same body of data, both methods yield the same Chi-square value. In other words, it is

Define (-c)² ≡ 1 and rearrange the equation; it is

Quod erat demonstrandum.

Theorem 3.15. (A fair study design III)

The guarantee of a fair study design is fundamental to any empirical scientific research and to every modern medical investigation. The framework of a fair study design should obey especially the principle of equality of arms, which is a central feature of every scientific combat, to ensure completely and only the discovery of the truth. The principle of equality of arms leaves no room for defending material interests, ideological positions or wishful thinking, but requires that advocates of a particular null hypothesis and opponents of the same null hypothesis have the same (mathematical-statistical) chances or possibilities at their disposal to reject or to accept the null hypothesis. One could sum up the principle of equality of (scientific) arms by saying that no party should have an unfair advantage over the other party, especially due to study design. Put in other terms, scientific research is not complete without the notion of fairness. Ignoring the historical origins and theoretical foundations of the principle of equality of (scientific) arms, a fair and careful study design, directed to the goal that a correct null hypothesis has to be accepted and that a false null hypothesis has to be rejected, is the core of evaluations to determine how believable a hypothesis is. Independently of the extent of the data to be recorded or the type of the study (case-control study, cohort study et cetera), formally the design of the study must ensure that the results of analyzing the data generated are the same. In the following, this problem will be analyzed from the standpoint of research on secondary data (i.e. case-control studies) and research on primary data under ideal conditions (no bias, no systematic errors, perfect accuracy of the measuring instrument et cetera). Aside from the type of study, we intend to give the same answer to scientific questions and to gain the same new knowledge.

Claim.
A study design which demands that a_t = d_t is fair, and the data are formally not self-contradictory due to study design, if

Proof.
Sometimes, a study design demands or assures conditions, or a sample, where

where a denotes the number of subjects (exposed and diseased) while d denotes the number of subjects (not exposed and not diseased). Adding (b + c + d) to the equation before, it is

or

In general, it is n = (a + b + c + d). Furthermore, it is Ā = (c + d) and B̄ = (b + d). The equation changes to

In other words, under these conditions the study design demands equally that

It is Ā = (n - A), and the equation before simplifies as

It is Ā = (n - A) and B̄ = (n - B). The equation above, derived as

simplifies as

or to

The condition a = d is assured too, if

Quod erat demonstrandum.
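The equivalence at the heart of this proof can be verified numerically. A short algebraic check (our reading of the derivation, offered as a sketch): since A + B = 2a + b + c and n = a + b + c + d, the demand a = d is equivalent to A + B = n. The following exhaustive check over small tables confirms this:

```python
from itertools import product

# Check that a = d holds exactly when A + B = n,
# with marginals A = a + b and B = a + c of the 2x2 table.
for a, b, c, d in product(range(6), repeat=4):
    n = a + b + c + d
    if n == 0:
        continue                 # skip the empty table
    A, B = a + b, a + c
    assert (a == d) == (A + B == n)

print("a = d is equivalent to A + B = n")
```

This is the same fairness condition (n = A + B) that reappears in Theorem 3.17, which supports reading a = d as one concrete way of enforcing it.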
Theorem 3.16. (A fair study design IV)

Claim.
A study design is fair from the standpoint of an exclusion relationship, and the data are formally not self-contradictory due to study design, if

Proof.
The Chi-square value of an exclusion relationship demands that

Both methods applied to one body of data should yield the same Chi-square value. In other words, it is

The study design to test the exclusion relationship demands that Ā = B and that A = B̄. These circumstances can, but need not, be identical with the demand that Ā = B = A = B̄. In other words, an exclusion relationship demands a study design where A = B̄. Adding B, it is

Quod erat demonstrandum.

Theorem 3.17. (A fair study design V)

Claim.
A study design is fair, too, from the standpoint of conditio sine qua non, conditio per quam and an exclusion relationship, and the data are formally not self-contradictory due to study design, if

Proof.
The Chi-square value of an exclusion relationship demands that

A study design can be regarded as fair from the standpoint of a conditio sine qua non and conditio per quam relationship if n = A + B. Substituting this relationship into the equation before, it is

At the same time, the study design should be fair with respect to an exclusion relationship. In this case, it is equally true that A = B. We obtain

Quod erat demonstrandum.

Theorem 3.18. (A fair study design VI)

Claim.
A study design which investigates the causal relationship between A_t and B_t should respect especially the law of independence. Whether or not the absence of independence is one aspect of a causal relationship, the study design should ensure that independence of A_t from B_t, and vice versa, can be recognized. Under the assumption of independence of A_t from B_t, a study design is fair, and the data are formally not self-contradictory due to study design, if

Proof.
Under the assumption of independence, it is

Multiplying by n×n, it is

or

Rearranging the equation, we obtain

In other words, under the condition of independence, a study design should fulfill the requirement that

We define ((n×a) - (A×B)) = 1. The equation before simplifies to

Under the assumption of independence, a study design is fair, too, if

Quod erat demonstrandum.
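Independence of A_t and B_t, written with counts, means p(a_t) = p(A_t)×p(B_t), i.e. n×a = A×B. A minimal sketch (function name ours, for illustration) computes the gap between the two sides:

```python
def independence_gap(a, b, c, d):
    """Deviation from independence: n*a - A*B, which is zero exactly
    when p(a) = p(A) * p(B) in the 2x2 table with cells a, b, c, d."""
    n = a + b + c + d
    A, B = a + b, a + c
    return n * a - A * B

# Fictive independent table: p(A) = 0.5, p(B) = 0.4, p(a) = 0.2 on n = 100
print(independence_gap(20, 30, 20, 30))   # -> 0
```

A non-zero gap signals dependence; its sign matches the sign of the phi-like association between A_t and B_t.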

Example.
Under the assumption of a conditio sine qua non relationship or of a conditio per quam relationship, the study design should assure that

Under the condition of independence, where A = B, we obtain

where n is the sample size drawn from the population.
Theorem 3.19. (Index of unfairness)

Aside from personnel, financial, organizational and logistical questions of a study, the scientific value of a (medical) study is determined especially by factors like study design, the statistical methodology used, sample size calculations and a properly selected, highly representative study population with defined and selective inclusion and exclusion criteria. The extent to which the measuring techniques and instruments used consistently provide the same results, if measurements are repeated, should be accurate enough. The significance of study design for the quality of the conclusions drawn is often underestimated. In point of fact, errors in the statistical evaluation can be corrected after the study has been completed. In contrast to errors in the statistical evaluation, it is difficult to correct errors in study design afterwards. A number of potential problems may bias the results of observational studies or even of well-planned, experimental randomized clinical trials. Nevertheless, even if many questions in human medicine can only be answered with observational studies, medical research studies themselves already provide additional statistical information on whether the data of a study can be considered suitable for evaluation by statistical test procedures in order to generalize the results from the sample to the whole population. The relation between data and hypothesis is of key importance in almost all empirical research, and those who plan to perform a study which should make an important contribution to medical knowledge must occupy themselves intensively with an appropriate and careful study design. Statistical methods which relate hypotheses to empirical facts may even enable us to extrapolate from data to predictions and general facts. Some of the main methodological problems can be avoided if the foundations of the statistical methods are logically and mathematically correct. Data have an impact on a hypothesis, but the impact should depend on the data themselves and not just on the study design of the researcher. The underlying question arises, therefore: how can such a problem (even ex post) be operationalized, meaning converted into an evaluable and measurable form?

Claim.
The index of unfairness (IOU) can be derived as

Proof.
Under conditions where the data of a study are analyzed by a Chi-square goodness-of-fit test of a necessary condition, the study design should assure that the same Chi-square value is achieved. In other words, it is

The index of unfairness, abbreviated as IOU, follows as IOU = ((A + B)/n) - 1. Quod erat demonstrandum.
The range of A is 0 ≤ A ≤ n, while the range of B is 0 ≤ B ≤ n. A study design based on A = B = 0 leads to an index of unfairness of IOU = ((0+0)/n) - 1 = -1. A study design which demands that A = B = n leads to an index of unfairness of IOU = ((n+n)/n) - 1 = +1.
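The index and its boundary cases can be sketched directly from the formula above (the function name is ours):

```python
def index_of_unfairness(A, B, n):
    """Index of unfairness: IOU = ((A + B) / n) - 1, where A and B are
    the marginal totals of the condition and the outcome and n is the
    sample size.  IOU = 0 characterizes a formally fair design
    (A + B = n, equivalently a = d)."""
    return ((A + B) / n) - 1

print(index_of_unfairness(0, 0, 100))       # -> -1.0  (maximally unfair)
print(index_of_unfairness(100, 100, 100))   # -> +1.0  (maximally unfair)
print(index_of_unfairness(60, 40, 100))     # -> 0.0   (fair: A + B = n)
```

The closer IOU is to zero, the closer the design is to the formal fairness condition of Theorems 3.13-3.17.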
Theorem 3.20. (The Chi-square distribution independent index of unfairness)

There are a number of measures that are of importance for epidemiologists and others. Which measure of disease frequency one should use depends on the type of study population and other specific risk factors. Among these measures are the prevalence (Noordzij et al., 2010), the incidence and others. Aiming to investigate the importance of a condition, an individual risk factor et cetera, it is often necessary to compare the risk of the outcome in a non-exposed group to the risk of the outcome in an exposed group. Prevalence is of help in many discussions of risk assessment and is able to describe how often a factor, a condition, a disease or another health event occurs in a population. In general, the prevalence reflects the number of existing cases of a disease. Let p1A denote the absolute frequency, the number of subjects within a whole population P1 having the condition, risk factor or disease A at a (period of) time/point t. Let p1N denote the size of the whole population P1. In other words, let the random variable p1A be equal to the number of p1N independent repetitions of a random experiment of the whole population P1 with the probability p(p1A) of success. That is, p1A_t is a binomial random variable. The ratio of p1A and p1N, denoted as p(p1A_t), is called the relative frequency or the prevalence of p1A and is defined as

Let p2B denote the absolute frequency, the number of subjects within a whole population P2 having the condition, risk factor or disease B at a (period of) time/point t. Let p2N denote the size of the whole population P2. The prevalence of p2B, denoted as p(p2B_t), is defined as

It is of course possible that the prevalence is calculated from the same population, or that P1 and P2 are identical. We are under ideal conditions, and the prevalence represents the real existing cases of a disease, a risk factor et cetera within a certain and completely known population. The whole target population corresponds to the entire set of subjects whose characteristics are investigated by a study. A sample is a finite part or subset of participants drawn from such a target population. Based on results obtained from such a sample, researchers try to draw some conclusions about the target population with a certain level of confidence. A sample contains fewer individuals than the whole population, but the representativeness of a sample should be preserved as much as possible to assure a valid statistical inference. A lack of representativeness of a sample, i.e. due to the process through which individuals are selected from the population and due to several other factors, can have a fundamental, negative impact on the validity of the conclusions drawn. In this context, the 2 × 2 contingency table as defined at the beginning of this article is a handy tool to understand the following definitions. Let A denote the absolute frequency (the expectation value et cetera), the number of subjects within the sample drawn from the population having the condition, the risk factor, the disease A at a (period of) time or point t. As the sample size grows, the relative frequency observed in such a sample tends to stabilize (Mlodinow, 2008). This is meanwhile known as the law of large numbers and was proved for the first time by the famous Swiss mathematician Jakob Bernoulli in 1713 (Bernoulli, 1713). According to the law of large numbers, the relative frequency with which an event actually occurs, p(A_t), and the "true" probability of the same event itself, p(p1A), should be approximately the same as long as an experiment is repeated a large number of times independently and under identical conditions. Thus far, let the random variables p1A_1, p1A_2, …, p1A_N be an independent trial process with a finite expected value, denoted as μ and calculated as μ = p(p1A_t) = (p1A_1 + p1A_2 + … + p1A_N)/p1N, with p1A = (p1A_1 + p1A_2 + … + p1A_N), and a finite variance. It is generally valid that, for any ε > 0, Bernoulli's law of large numbers assures (Scheid, 1992) that

or
Converting a scientific research question into an appropriate methodological and clinical study design, in order to be able to draw meaningful inferences from the data analyzed, is a real challenge. Particular attention must be paid to the methods used to calculate the sample size. Based on these findings, it is necessary to point out that the methods of calculating the magnitude of ε will not be considered in detail; this relates to various methods of exact sample size calculation (Charan & Biswas, 2013) which can be found in the literature. Still, using Chebyshev's inequality (Bienaymé, 1846; Tchébychef, 1867) on (p1A/p1N) to exemplify Bernoulli's law of large numbers in more detail, we obtain a solution for a fair study design independent of the distribution used. Thus far, let us assume a study design has assured conditions where the relative frequency p(A_t) of a sample deviates from the relative frequency p(p1A) of the whole population by less than ε > 0. The probability p(|p(A_t) - p(p1A)| < ε) that a relative frequency p(A_t) of a sample deviates by less than ε from the relative frequency p(p1A) of a whole population can be calculated very precisely. By Chebyshev's inequality (Scheid, 1992), for any ε_a > 0, Bernoulli's law of large numbers (Hogg and Craig, 2004, pp. 119-120) for fixed ε_a becomes

In regard to the deviation of the relative frequency p(B_t) of a sample from the relative frequency p(p2B_t) of a whole population P2, we obtain

Claim.
Under conditions of the validity of Bernoulli's law of large numbers, the index of unfairness indicating a fair study design follows as

Proof.

In general, +1 is equal to +1 (lex identitatis (von Leibniz, 1765; Barukčić, 2016)), or

Multiplying by p, we obtain 1×p = 1×p, or

where p denotes the probability that a relative frequency of a sample deviates by less than ε from the relative frequency of a whole population. In other words, with p = p(|p(A_t) - p(p1A)| < ε_a), we obtain

In the same sample, we investigate a second random variable B. We assume that the relative frequency of the sample p(B_t) deviates by less than ε from the relative frequency p(p1B) of the whole population too. In other words, it is p = p(|p(A_t) - p(p1A)| < ε_a) = p(|p(B_t) - p(p1B)| < ε_b). We obtain

while ε_a can, but need not, be equal to ε_b. In general, according to Bernoulli's law of large numbers, it is

or

In the following, we consider conditions where n_a×(ε_a)² = n_b×(ε_b)². This covers also the case that ε_a is much smaller than ε_b, with the consequence that n_a must be much greater than n_b, and vice versa. This includes especially conditions where ε_a = (x/100)×p(A_t) with (0 < x < 100), or where ε_b = (y/100)×p(B_t) with (0 < y < 100). In other words, while performing some investigations, a sample of size n is investigated while the above conditions should be satisfied. Thus far, multiplying by the sample size n, it is

It is not completely unjustified to note that the index of unfairness (IOU) indicates, to some extent, the degree to which the representativeness of a sample is preserved, even though this condition is not given all the time. Thus far, in general, the index of unfairness cannot be regarded as the degree to which the representativeness of a sample is preserved. Under some restricted conditions, the Chi-square goodness-of-fit test of the index of unfairness can be defined as
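The Chebyshev-based reasoning above can be made concrete. For a binomial relative frequency, Chebyshev's inequality gives the distribution-free bound P(|p̂ - p| ≥ ε) ≤ p(1-p)/(n×ε²); the following sketch (function name ours) uses it to find the sample size at which the bound drops to a chosen risk level:

```python
def chebyshev_bound(p, n, eps):
    """Distribution-free upper bound, via Chebyshev's inequality, on the
    probability that the relative frequency of n Bernoulli(p) trials
    deviates from p by eps or more:
    P(|p_hat - p| >= eps) <= p*(1 - p) / (n * eps**2)."""
    return min(1.0, p * (1 - p) / (n * eps ** 2))

# Sample size so that the bound drops to 5% for eps = 0.05,
# in the worst case p = 0.5 (largest variance).
p, eps = 0.5, 0.05
n = 1
while chebyshev_bound(p, n, eps) > 0.05:
    n += 1
print(n)   # -> 2000
```

The bound is deliberately conservative; exact binomial or normal-approximation calculations (Charan & Biswas, 2013) give smaller sample sizes, but the Chebyshev form has the distribution-free character the theorem relies on.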

Discussion
Failure to apply rigorous standards and appropriate statistical methods in biomedical research, even on data collected from a valid scientific design, can lead to misleading conclusions, and the costs to patients and society can be high. Briefly, many times a very large sample is required to discover a small difference. Still, the sample size of medical studies is often too small, so that the power is also too small and a relationship is either described only imprecisely (Moher et al., 1994) or even left unidentified. Anyone who denies that a very careful calculation of the sample size of a study is one of the problems of scientific research may characteristically insist that the measurements themselves may be invalid or false and may lead to erroneous conclusions too. Therefore, whatever the sample size calculation and other potential sources of bias, the selection of an appropriate study design, free of publication bias, is important too, in order to be able to generalize the study results to the population. Funnel plots are one of the several attempts made to assess the magnitude of publication bias. Many meta-analyses show funnel plots, as proposed in 1984 by Light and Pillemer (Light and Pillemer, 1984), to examine whether there is evidence for or against the presence of publication bias. The funnel plot is a kind of scatter plot with some measure of weight (such as the sample size, the inverse variance, the standard error et cetera) on the vertical axis and the treatment effect on the horizontal axis. More precisely, the funnel plot's (plots of effect estimate against sample size) wide popularity, which followed an article published in the BMJ in 1997 (Egger et al., 1997), is not unrestrictedly justified. The capacity of the funnel plot to detect publication bias in meta-analyses is often misleading (Lau et al., 2006) and equally inaccurate, especially for meta-analyses of proportion studies (Hunter et al., 2014) with low-proportion outcomes.
To date, funnel plots may overlook serious bias (Lau et al., 2006), and it is still unclear whether funnel plots really diagnose publication bias (Zwetsloot et al., 2017) at all. The use of the index of unfairness is more appropriate in this context. Medical studies with favorable results are published more often, and more quickly, than trials with negative findings, which can lead to publication bias (Hopewell et al., 2009). As a result of such a scientific practice, or scientific evolution, or a natural (peer-review dominated) selection, mainstream-compatible or funding-source-adequate favorable results are published more often than other ones. In the long run, there is a serious overestimation of the effects found in the literature, and a damaging and escalating effect on the integrity of scientific knowledge is possible. Unfortunately, there is evidence suggesting that this systematic publication bias, documented in the literature for decades, is increasing (Joober et al., 2012). Finally, the prevention of publication bias is of course much more desirable than a corrective or diagnostic analysis, or the finding or excluding of publication bias. With regard to the prevention of publication bias, the index of unfairness is of use, since it can help to reduce errors due to study design. Nonetheless, the detection of publication bias even ex post, by means of the index of unfairness, is possible and needed in order to detect or to avoid inconsistent conclusions too. The index of unfairness and the other methods developed in this publication are useful tools for the further evaluation of publication bias and can help to reduce the impact of publication bias on the certainty of scientific evidence.

Conclusion
Today's several attempts to assess the magnitude of publication bias require several assumptions which are difficult to ascertain and are burdened with various other limitations. The presence of publication bias can be determined even ex post while using the index of unfairness.

Financial Support and Sponsorship
Nil.

Definition 2.1.9. (The X² Test of Goodness of Fit of a Necessary Condition); (The X² Test of Goodness of Fit of the Exclusion Relationship)

Definition 2.1.14. (The Binomial distribution) As the number of trials n goes to positive infinity (n → +∞), the equation above simplifies: the probability p in a sequence of n independent Bernoulli trials (experiments) with q = 1 - p and λ = n×(1-p) is given approximately by e^(-λ). Since A = p(A_t)×n and B = p(B_t)×n, the index of unfairness, abbreviated as IOU, follows under conditions of the validity of Bernoulli's law of large numbers as above.

Figure 1. Index of unfairness and probability

Table 1. The probabilities of a contingency table

Let b_t = 1 if the t-th outcome is a success and 0 if it is a failure. Then b = (b_1 + b_2 + ... + b_n) is the number of successes in n Bernoulli trials (periods of time) t, and p(b_t) = p(A_t ∩ ¬B_t) is the joint probability of (A_t and not B_t). Let c_t = 1 if the t-th outcome is a success and 0 if it is a failure. Then c = (c_1 + c_2 + ... + c_n) is the number of successes in n Bernoulli trials t, and p(c_t) = p(¬A_t ∩ B_t) is the joint probability of (not A_t and B_t). Let d_t = 1 if the t-th outcome is a success and 0 if it is a failure. Then d = (d_1 + d_2 + ... + d_n) is the number of successes in n Bernoulli trials t, and p(d_t) = p(¬A_t ∩ ¬B_t) is the joint probability of (not A_t and not B_t). Let ¬A denote the complementary random variable of the binomial random variable A with the probability p(¬A_t); it is ¬A_t = (c_t + d_t) at the same Bernoulli trial (period of time) t. Let ¬B denote the complementary random variable of the binomial random variable B with the probability p(¬B_t); it is ¬B_t = (b_t + d_t) at the same Bernoulli trial t.

Definition 2.1.3. (A two-way or contingency table.) In this context, let us define p(A_t) = p(a_t) + p(b_t), that is, p(A_t) = p(A_t ∩ B_t) + p(b_t) = p(A_t ∩ B_t) + p(A_t ∩ ¬B_t), while p(A_t) is not identical with p(a_t). Thus far, it is p(B_t) = p(a_t) + p(c_t) = p(A_t ∩ B_t) + p(c_t), and equally p(¬B_t) = 1 − p(B_t) = p(b_t) + p(d_t). Since the joint probability of A_t and B_t is denoted in general by p(A_t ∩ B_t), it is p(A_t ∩ B_t) = p(A_t) − p(b_t) = p(B_t) − p(c_t), and therefore p(B_t) + p(b_t) − p(c_t) = p(B_t) + p(Λ_t) = p(A_t). There may exist circumstances where Λ_t is identical or associated with Einstein's cosmological 'constant'. In general, it is p(a_t) + p(b_t) + p(c_t) + p(d_t) = +1. The meaning of the abbreviations a, b, c, d, n et cetera is shown by the following 2-by-2 table.

                 B_t          ¬B_t         Total
    A_t          p(a_t)       p(b_t)       p(A_t)
    ¬A_t         p(c_t)       p(d_t)       p(¬A_t)
    Total        p(B_t)       p(¬B_t)      +1

Published by IDEAS SPREAD
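The identities of the contingency table can be checked numerically. The following sketch uses fictive counts (a = 40, b = 10, c = 5, d = 45 are illustrative assumptions, in the spirit of the article's examples with fictive data):

```python
# Fictive 2-by-2 counts; cell meanings follow the definitions above:
# a = (A and B), b = (A and not B), c = (not A and B), d = (not A and not B).
a, b, c, d = 40, 10, 5, 45
n = a + b + c + d

p_a, p_b, p_c, p_d = a / n, b / n, c / n, d / n
p_A = p_a + p_b      # marginal probability p(A_t)
p_B = p_a + p_c      # marginal probability p(B_t)
p_joint = p_a        # joint probability p(A_t and B_t)

# The four cell probabilities partition the sample space: they sum to +1.
assert abs((p_a + p_b + p_c + p_d) - 1.0) < 1e-12
# p(A_t and B_t) = p(A_t) - p(b_t) = p(B_t) - p(c_t)
assert abs(p_joint - (p_A - p_b)) < 1e-12
assert abs(p_joint - (p_B - p_c)) < 1e-12

print(p_A, p_B, p_joint)
```

The assertions restate the marginal identities of Definition 2.1.3; any set of non-negative counts satisfies them, since they are bookkeeping identities rather than statistical hypotheses.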

Table 2. The sample space of a contingency table

Table 3. A fair coin

Thus far, we assume that a null hypothesis (H_0) is true. Let p(A_t) denote the probability of a condition (i.e. a risk factor), let p(B_t) denote the probability of the conditioned (i.e. the outcome), and let p(A_t ∩ B_t) denote the joint probability of A_t and B_t. The relationship between A_t and B_t can be determined in many ways; both can also be independent of each other. Still, under conditions where the relationship between an event A_t and another event B_t is determined by a necessary condition, a conditio sine qua non, it is p(A_t ← B_t) = p(A_t ∩ B_t) + (1 − p(B_t)) = p(d_t) + p(A_t) = 1, and in this case of course equally p(A_t ∩ B_t) = p(B_t).
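In contingency-table terms, the necessary-condition identity equals p(a_t) + p(b_t) + p(d_t) = 1 − p(c_t): the relationship holds exactly when the cell c (B without A) is empty. A small numerical check of this; the function name and the fictive counts are assumptions of this sketch, not part of the original text:

```python
def p_sine_qua_non(a, b, c, d):
    # p(A_t <- B_t) = p(a_t) + p(b_t) + p(d_t) = 1 - p(c_t),
    # i.e. p(A_t and B_t) + (1 - p(B_t)) expressed through the cell counts.
    n = a + b + c + d
    return (a + b + d) / n

# Cell c (B without A) never occurs: the conditio sine qua non holds exactly.
assert p_sine_qua_non(40, 10, 0, 50) == 1.0
# A single counterexample (c = 1) pushes the probability below 1.
assert p_sine_qua_non(40, 10, 1, 49) < 1.0
print(p_sine_qua_non(40, 10, 0, 50))
```

The check makes the logic of the conditio sine qua non concrete: one observed case of the outcome without the risk factor is enough to refute the claim that the risk factor is a necessary condition.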
Let A denote the absolute frequency, the number of subjects within the sample drawn from the population having the condition, the risk factor A, at a (period of) time point t, i.e. at Bernoulli trial t. Let n_a denote the size of the sample. The "prevalence" of A of a sample, denoted as p(A_t), is defined as p(A_t) = A / n_a. Equally, let B denote the absolute frequency, the number of subjects within the sample drawn from the population having the outcome, the disease B, at a (period of) time point t. Let n_b denote the size of the sample. The "prevalence" of B of a sample, denoted as p(B_t), is defined as p(B_t) = B / n_b.

In other words, the relative frequency p(A_t) of risk factor A of a sample should not deviate at all, or not too much, from the relative frequency of risk factor A of the whole population P1. And the same with respect to the outcome: the relative frequency p(B_t) of the outcome B of a sample should not deviate at all, or not too much, from the relative frequency of outcome B of the whole population P2. To some extent, this problem is already solved with respect to the Chi-square distribution, as demonstrated before in this article; we obtained the condition that a study design is appropriate if n = A + B. Still, the Chi-square distribution is sometimes of only limited use. An approach which solves the problem of fair study design independently of the distribution used is therefore of practical importance. Thus far, Gerolamo Cardano (1501-1576), an Italian mathematician, stated without a proof that a statistical measure tends to improve as the number of trials increases.
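Under the assumption that the Chi-square check mentioned here amounts to an ordinary goodness-of-fit comparison of a sample prevalence with a known population prevalence (1 degree of freedom, 5% critical value 3.841), a hedged sketch with fictive numbers is:

```python
# Hedged sketch, not the article's own formula: chi-square goodness of fit
# of an observed sample prevalence against a known population prevalence.
def chi_square_prevalence(observed, n, p_population):
    expected = n * p_population
    # One term for subjects with the trait, one for subjects without it.
    return ((observed - expected) ** 2 / expected
            + ((n - observed) - (n - expected)) ** 2 / (n - expected))

# Fictive sample: 58 of 100 subjects carry the risk factor; population prevalence 0.5.
x2 = chi_square_prevalence(observed=58, n=100, p_population=0.5)
fair = x2 < 3.841  # fail to reject H0: the sample prevalence matches the population
print(round(x2, 3), fair)
```

Here x2 = 2.56 < 3.841, so the fictive sample's prevalence is compatible with the population prevalence at the 5% level; a sample failing this check would indicate an unfair (non-representative) study design in the sense discussed above.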