Python: split any (binary) file into pieces and join them back together

This Python program takes an input file and splits it into smaller chunks of a fixed size. It can then read the chunks back in order and join them to recreate the original file.
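To make the chunk arithmetic concrete, here is a small sketch of how many chunks a given file size produces and how large the last one is. The function names and the 700,000,000-byte file size are illustrative assumptions, not part of the program.

```python
def chunk_count(file_size, chunk_size):
    """Number of chunks needed to hold file_size bytes (ceiling division)."""
    return (file_size + chunk_size - 1) // chunk_size


def last_chunk_size(file_size, chunk_size):
    """Size in bytes of the final chunk (0 for an empty file)."""
    if file_size == 0:
        return 0
    return file_size % chunk_size or chunk_size


# hypothetical 700,000,000-byte file split into 110,000,000-byte chunks
print(chunk_count(700000000, 110000000))      # 7
print(last_chunk_size(700000000, 110000000))  # 40000000
```

Every chunk except possibly the last is exactly the chunk size, so the chunk count and chunk size are all the information needed to find each piece again.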
--------------------------------------------------------


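# define the function to split the file into smaller chunks
def splitFile(inputFile, chunkSize):

    # read the entire content of the file in binary mode
    with open(inputFile, 'rb') as f:
        data = f.read()

    # size of the input file in bytes
    # (named fileSize so the built-in bytes type is not shadowed)
    fileSize = len(data)

    # calculate the number of chunks to be created, rounding up;
    # // keeps the result an integer on both Python 2 and Python 3
    noOfChunks = fileSize // chunkSize
    if fileSize % chunkSize:
        noOfChunks += 1

    # create an info.txt file for writing metadata:
    # input file name, chunk name prefix, number of chunks, chunk size
    with open('info.txt', 'w') as f:
        f.write(inputFile + ',' + 'chunk,' + str(noOfChunks) + ',' + str(chunkSize))

    # write each chunk to its own file, named 'chunk' + starting byte offset;
    # stopping at fileSize (not fileSize+1) avoids a stray empty chunk
    # when the file size is an exact multiple of chunkSize
    chunkNames = []
    for i in range(0, fileSize, chunkSize):
        fn1 = "chunk%s" % i
        chunkNames.append(fn1)
        with open(fn1, 'wb') as f:
            f.write(data[i:i + chunkSize])

    return chunkNames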



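# define the function to join the chunks of files into a single file
# note: the output is written to a file named fileName (the chunk name prefix)

def joinFiles(fileName, noOfChunks, chunkSize):

    # read the chunks in order; each chunk name is the prefix
    # followed by the chunk's starting byte offset
    dataList = []
    for i in range(0, noOfChunks, 1):
        chunkNum = i * chunkSize
        chunkName = fileName + '%s' % chunkNum
        with open(chunkName, 'rb') as f:
            dataList.append(f.read())

    # write all chunks back to back into the output file
    with open(fileName, 'wb') as f:
        for data in dataList:
            f.write(data)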


# call the file splitting function

splitFile('1.mkv', 110000000)

# call the function to join the split files; the chunk count (7 here)
# and the chunk size must match the values recorded in info.txt
joinFiles('chunk', 7, 110000000)
---------------------------------------------------
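The program above reads the whole input into memory and needs the chunk count passed in by hand. As a hedged alternative, the sketch below reads one chunk at a time and lets the join step take its parameters from info.txt. The names split_streaming and join_from_info, the out_dir and output_path parameters, the "joined_" output prefix, and storing only the base name of the input in info.txt are all assumptions made for illustration; only the chunk naming and the comma-separated metadata layout follow the original.

```python
import os


def split_streaming(input_path, chunk_size, out_dir=".", prefix="chunk"):
    """Split input_path into files of at most chunk_size bytes, one read at a time."""
    count = 0
    with open(input_path, "rb") as src:
        while True:
            block = src.read(chunk_size)  # holds at most one chunk in memory
            if not block:
                break
            # same naming scheme as the original: prefix + starting byte offset
            name = os.path.join(out_dir, "%s%d" % (prefix, count * chunk_size))
            with open(name, "wb") as dst:
                dst.write(block)
            count += 1
    # same layout as the original info.txt: name,prefix,count,chunk size
    with open(os.path.join(out_dir, "info.txt"), "w") as info:
        info.write("%s,%s,%d,%d" % (os.path.basename(input_path), prefix, count, chunk_size))
    return count


def join_from_info(out_dir=".", output_path=None):
    """Rebuild the original file from the chunks described by info.txt."""
    with open(os.path.join(out_dir, "info.txt")) as info:
        # rsplit from the right so commas inside the file name survive
        name, prefix, count, chunk_size = info.read().strip().rsplit(",", 3)
    count, chunk_size = int(count), int(chunk_size)
    if output_path is None:
        # assumed default: do not overwrite the original input file
        output_path = os.path.join(out_dir, "joined_" + name)
    with open(output_path, "wb") as dst:
        for i in range(count):
            chunk_path = os.path.join(out_dir, "%s%d" % (prefix, i * chunk_size))
            with open(chunk_path, "rb") as src:
                dst.write(src.read())
    return output_path
```

With this variant, split_streaming('1.mkv', 110000000) followed by join_from_info() would write joined_1.mkv in the current directory, without the chunk count being typed in by hand.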



Comments

I needed to alter it a bit to get it working: it wouldn't write to the file in joinFiles() since the file was already closed. This one worked for me in Python 2.7:
https://gist.github.com/mattiasostmar/7883550
chandu said…
Modified it a bit more and separated it into splitter and joiner Python files.
https://github.com/csmunuku/file_splitter_joiner
