Quote:
Originally Posted by oompa loompa
Well yes, there is a reliable sample size, depending on the number of variables and the distribution it comes from, because you fix the probability of precision (usually 95%, I don't remember if that's just convention or if it maximizes power somehow), so your gains from increasing sample size beyond a point decrease dramatically.
Also, since you calculate your SD from your variance, it is again important to have a large sample size, because variance IS something you can test statistically, not with a p-value, but still. In this example, the number of variables isn't given, but if it's more than 3 I can definitely say that 3 is not enough. Besides, SD is a useless statistic without a sample mean (which I'm assuming is also being calculated). I'm a little rusty (and generally stats is not my strong suit), but since the sample mean is approximately normally distributed by the CLT, there is going to be an associated t-statistic and p-value.
It's not misleading if you're not trying to mislead anyone. It is misleading if the claim is that an SD from 3 data points (and fewer than 3 variables) is a good predictor of what the deviations of the results of future experiments will be. For example, if you get a result more than x SDs away for your 4th experiment, how will you know if something went wrong or right? Forget SD, if you don't have a reliable mean, how will you know whether your result was close to what the 'usual' result is or not? If you don't have a reliable variance, how will you know how much deviation is acceptable without being called an abnormality?
At the same time, I agree with you that 3 is better than nothing. It's always going to help, but one would have to say that the results are merely indicative, and not a reliable predictor of how future experiments will pan out.

Sorry guys, I was out of it till now.
Of course, in an ideal world we would like to set up a survey of 100 samples with 100 duplicates, but generally we don't have enough time and human resources to follow through with that.
What I did was basically set up an experiment that uses 10 data points to produce a trend (something linear, like interviewing 10 people to link coffee consumption with age, for example). I generally agree that a repeat has to be done to make sure that the results are accurate.
But when it comes to data analysis, what I think we have to do is: repeat the experiment, then fit a trend to all 20 data points (10+10), draw a line through them, and take the R^2 as our measure of how much of the variance the line explains.
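To make the pooled approach concrete, here is a minimal sketch of what I mean. The ages and cups-per-day numbers are invented purely for illustration, and `fit_line` is just a hypothetical helper name, not anything from a library. It fits one least-squares line through all 20 points and reports R^2, which is the fraction of the variance in the response that the line accounts for.

```python
# All numbers below are made up for illustration; they are not real survey data.
run1_ages = [21, 25, 30, 34, 38, 42, 47, 51, 56, 60]
run1_cups = [1.0, 1.5, 1.2, 2.0, 2.4, 2.1, 3.0, 2.8, 3.5, 3.3]
run2_ages = [22, 27, 29, 35, 40, 44, 46, 52, 55, 61]
run2_cups = [0.8, 1.6, 1.9, 1.7, 2.6, 2.2, 2.7, 3.2, 3.0, 3.8]


def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares: what the line fails to explain.
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    r_squared = 1.0 - ss_res / syy
    return slope, intercept, r_squared


# Pool both runs into one data set of 20 points and fit a single line.
pooled_ages = run1_ages + run2_ages
pooled_cups = run1_cups + run2_cups
slope, intercept, r_squared = fit_line(pooled_ages, pooled_cups)
print(f"slope={slope:.4f} intercept={intercept:.4f} R^2={r_squared:.4f}")
```

The point of pooling is that every interview contributes to one estimate of the trend, instead of each run being boiled down to a single number first.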
My boss, meanwhile, wants to repeat the experiment three times (interviewing 10 people each time), draw a trend, get the slope each time, then take the standard deviation (SD) of those 3 slopes just to have an SD, which I find really absurd. I mean, I've even seen guys in my office put that +/- sign on their data using only 2 data points, just to suit my boss's taste.
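Here is a rough sketch of the two ways of getting an uncertainty on the slope, so the difference is easier to see. Again, the data are invented for illustration and `slope_and_se` is a hypothetical helper of my own. The first number is the SD of three per-run slopes (so it rests on only 2 degrees of freedom); the second is the usual least-squares standard error of the slope from one fit to all 30 points (28 degrees of freedom).

```python
import math
import statistics

# Three runs of 10 interviews each; all values are made up for illustration.
runs = [
    ([21, 25, 30, 34, 38, 42, 47, 51, 56, 60],
     [1.0, 1.5, 1.2, 2.0, 2.4, 2.1, 3.0, 2.8, 3.5, 3.3]),
    ([22, 27, 29, 35, 40, 44, 46, 52, 55, 61],
     [0.8, 1.6, 1.9, 1.7, 2.6, 2.2, 2.7, 3.2, 3.0, 3.8]),
    ([20, 26, 31, 33, 39, 43, 48, 50, 57, 62],
     [1.2, 1.1, 1.8, 2.2, 2.0, 2.7, 2.5, 3.1, 3.4, 3.6]),
]


def slope_and_se(x, y):
    """Least-squares slope and its standard error for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    # Residual variance uses n - 2 because two parameters were estimated.
    se = math.sqrt(ss_res / (n - 2) / sxx)
    return slope, se


# Boss's approach: one slope per run, then the sample SD of those 3 slopes.
slopes = [slope_and_se(x, y)[0] for x, y in runs]
sd_of_slopes = statistics.stdev(slopes)

# Pooled approach: a single fit to all 30 points, with its standard error.
pooled_x = [xi for x, _ in runs for xi in x]
pooled_y = [yi for _, y in runs for yi in y]
pooled_slope, pooled_se = slope_and_se(pooled_x, pooled_y)

print("per-run slopes:", [round(s, 4) for s in slopes])
print("SD of the 3 slopes:", round(sd_of_slopes, 4))
print("pooled slope:", round(pooled_slope, 4), "+/-", round(pooled_se, 4))
```

Note the two numbers answer slightly different questions: the SD of the slopes describes run-to-run scatter, while the standard error describes how well the pooled slope itself is pinned down. With only three runs, the first is a very noisy estimate either way.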