
t-test with minimum number of samples


There are two different ways to justify the use of the t-test:
  1. Your data is normally distributed and you have at least two samples per group 
  2. You have large sample sizes in each group
If either of these cases holds, the t-test is considered a valid test. So if you are willing to assume that your data is normally distributed (as many researchers who collect small samples are), then you have nothing to worry about.
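As a concrete illustration of case 1, here is a minimal sketch of a two-sample t-test on small normally distributed groups, assuming NumPy and SciPy are available; the group sizes, means, and seed are illustrative choices, not from the original post.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Two small groups drawn from normal distributions (justification 1 above):
# normality is assumed, so even n = 8 per group is enough for a valid t-test.
group_a = rng.normal(loc=5.0, scale=1.0, size=8)
group_b = rng.normal(loc=6.0, scale=1.0, size=8)

t_stat, p_value = stats.ttest_ind(group_a, group_b)
print(f"t = {t_stat:.3f}, p = {p_value:.3f}")
```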
However, someone might reasonably object that you are relying on this assumption to get your results, especially if your data is known to be skewed. In that case, the question of how large a sample is required for valid inference is a very reasonable one.
As for how large a sample size is required, unfortunately there is no solid answer: the more skewed your data, the bigger the sample size required to make the approximation reasonable. 15-20 per group is usually considered reasonably large, but as with most rules of thumb, there exist counterexamples. For example, with lottery ticket returns (where 1 in, say, 10,000,000 observations is an EXTREME outlier), you would literally need somewhere around 100,000,000 observations before these tests would be appropriate.
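You can see the effect of skew on test validity directly by simulation. The sketch below, assuming NumPy and SciPy, runs a one-sample t-test on skewed (exponential) data for which the null hypothesis is actually true; a well-calibrated test should reject about 5% of the time. The exponential distribution, sample sizes, and simulation count are illustrative choices, not from the original post.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_sims = 10_000
alpha = 0.05

# Exponential(1) data has true mean 1, so testing H0: mean = 1 means the
# null is true and the rejection rate estimates the type I error.
for n in (5, 15, 50, 500):
    rejections = 0
    for _ in range(n_sims):
        sample = rng.exponential(scale=1.0, size=n)
        _, p = stats.ttest_1samp(sample, popmean=1.0)
        if p < alpha:
            rejections += 1
    print(f"n = {n:>3}: type I error rate ~ {rejections / n_sims:.3f}")
```

With a distribution this skewed, the small-n rejection rates come out noticeably above the nominal 5% and only approach it as n grows, which is exactly the point about needing larger samples for more skewed data.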
