Site Network: Home | About

Now Trust FTP!!

FTP still continues to be part of many projects when there is a need to transfer the files across the machines. But the question is about its reliablility. It's very useful but unfortunately it's equally unreliable from programming perspective, I believe. Especially those who have some experience on Unix/Linux platform and have love towards $? (exit status of last issued command) would not feel good for FTP.

There are many ftp tools out there in the market that are exception sensitive, but we , including me, are still using the standard/default ftp client that comes with most Linux/Unix/Windows versions. That's not bad as long as there is a way to tune it in our way. In fact ftp does exceptionaly well in terms of speed , logging etc... - all it lacking is error handling. Irrespective of any success/failure of a command or even in the case of connectivity issues ftp always returns "success" ("0") as exit status. That's really annoying, because even a simple "ls" command handles this properly.

$ ls abc
ls: abc: No such file or directory
$ echo $?
1
$

"ls" command returned 1 (failed) as couldn't find a file called "abc". See the below similar attempt on ftp.

$ ftp -n 3.112.45.192< user karteek password
> bye
> E
User karteek cannot log in.
Login failed.
$ echo $?
0
$

This is why I said - it's annoying. Login itself failed, but status is success. I tested, whatever the trash you issue in ftp console, final exit status of ftp is "0" that's success.

So, people have been adding thier own customization to ftp to validate the file transfer. Few of some known logics are like (assuming get command is used, instead put)...

Get all the data files along with and additional control file containing the file_name and size_in_bytes (or just sum of sizes of all files as single value) information. After FTP completion, validate the files in the local machine using the size value specified in control file.

  1. Using FTP status codes. FTP generates status codes upon each command execution, including some additional codes in its life cycle. Status codes can be grepped from ftp log file and execution can be validated. First 3 digits i nthe log file gives the status code. Note that, a simple grep (grep "^XYZ") is not reliable, because when a file is transferred, bytes send or received is also logged with number of bytes at the start of status line). So, we should filter out "bytes sent" (in case of put) and "bytes received" first and then grep the status codes. FTP status codes is listed in the below link. http://www.ftpplanet.com/ftpresources/ftp_codes.htm
  2. You can even read the number of "bytes received" that's being logged for each file transfer in the ftp log file (like I said in above point). Sum up all those values and compare that with the local files' size. This approach is better than 1st one, because first approch requires an explicit aggrement needs to be made between files sender and receiver that a control file is generated each time.

However both methods have one common problem. File sizes may not be same on both remote as well as local machine, especially when the transfer mode is ascii. Suppose, if a file pulled (get) from a windows machine to a unix machine or vice versa, carriage returns are removed or appended to every end of line (because eod-of-line diffrence between windows, \r\n, and linux ,\n). So, if a file is pulled from windows to unix then its local file size would be less than remote file. And moreover, in such ascii PC to Linux transfers warnings like " bare linefeeds received in ASCII mode" are also very common which again would have impact on size.

So, I tried another approach, and it seemd to be doing well as of now.
  1. Start FTP and prompt off
  2. Get the files to local machine (say, using mget)
  3. Create a new file on local machine with file listing of remote directory (using mls or mdir).
  4. Disconnect FTP
  5. No need to try $? - that's mere waste!
  6. Calculate sum of sizes of all the files present in additionally created file
  7. Grep the ftp log file for "bytes received in" and find total of bytes
  8. Compare above 2 values, and they should be same for a successfull file transfer

With this same I've this below script...at least this helps as a foundation or creates some line of thought

###################################################################
# Scrit Name: ValidateFTP.ksh
# by Karteek Yelampally, 03/13/2009
###################################################################
#  Functiono to validate the FTP transfer

Validate_ftp(){
# Arguments to the current function
FTP_LOG=$1
FILE_LST_R=$2 # Validate the ftp transfer

if [ -f ${FILE_LST_R} ]; then # Listing file exists 
   if [ ! -s ${FILE_LST_R} ]; then # Listing file exists but with zero size 
   echo "No files found on the remote machine with the specified pattern(s) ${FILE_PAT}" >> ${FTP_LOG} return 0 
   fi
else # Listing file does not exist 
   echo "Fatal ERROR: File transfer failed. Failure in getting the file listing on remote machine." >> ${FTP_LOG} 
   return 1
fi 
echo >> ${FTP_LOG}

echo "File listing on the remote machine..." >> ${FTP_LOG}

cat ${FILE_LST_R} >> ${FTP_LOG} # Read the ftp log file and calculate number of bytes sent by ftp

FILE_SIZE_L=`grep "bytes received in" ${FTP_LOG} awk '{x+=$1}END{print x}'` # Read the remote file listing and calculate the total files' size on the remote machine

FILE_SIZE_R=`awk '{x+=$3} END{print x}' ${FILE_LST_R}` # Check if the remote files' size and local files' size is equal or not

if [ "${FILE_SIZE_L}" -eq "${FILE_SIZE_R}" ]; then 
   echo "File transfer completed successfully. [${FILE_SIZE_L}] bytes transferred." >> ${FTP_LOG}
else 
   echo "File transfer failed. Size on remote machine [${FILE_SIZE_R}] does not match with Bytes received [${FILE_SIZE_L}]." >> ${FTP_LOG} 
   return 1
fi
return 0
} 

# Start scipt
FTP_HOST=3.1.45.192
FTP_USER=FTP_Security
FTP_PWD=Zz62BaTs 
TGT_HOME=/export/home/karteek/SrcFiles

FTP_LOG=/export/home/karteek/Logs/`basename $0 .sh`_`date +"%Y%m%d%H%M%S"`.log
FILE_LST_R=/export/home/karteek/Temp/list_of_remote_file$$.tmp
echo "Log file is ${FTP_LOG}" FILE_PAT="*DAT" FTP_REMOTE_DIR="/ftp/vendor"
FTP_LOCAL_DIR=${TGT_HOME}/received_files echo "FTP Started"

ftp -n -v ${FTP_HOST} <&1 >> ${FTP_LOG}
user ${FTP_USER} ${FTP_PWD}
cd ${FTP_REMOTE_DIR}
lcd ${FTP_LOCAL_DIR}
prompt
ascii
mget ${FILE_PAT}
mdir ${FILE_PAT} ${FILE_LST_R}
bye
EOFtp

echo "FTP Finished"

Validate_ftp "${FTP_LOG}" "${FILE_LST_R}"

echo "FTP Status is " $? 
###################################################################

"mdir ${FILE_PAT} ${FILE_LST_R}" in the ftp commands creates the file, ${FILE_LST_R}, with remote file listing (having the pattern ${FILE_PAT}) on to local machine.

Let me your comments please. I would like to share your thoughts.

- Karteek

0 Comments:

Post a Comment