Text File Explorer: Determine file headers & delimiters without opening the file

I often work with large text files whose extension (.txt, .dat, .csv, .tab) doesn’t always indicate which delimiter the file uses.  When the file is small, I’ll typically just “pop” it open in SciTE / Notepad.  Large files (anything over 20 MB) often take a fair amount of time to open, and very large files (a gigabyte or more) will often run into out-of-memory issues.

On top of wanting to know the delimiter, I also frequently want to know what fields / headers are in the file.  This normally means I have to open it in a text editor or Excel and review it.  I wrote the Text File Explorer below as an AutoHotkey script to simplify this.  I can highlight a file in Windows Explorer and quickly detect the type of delimiter, plus display the headers if I care to.

Text file explorer AutoHotkey code:

 
#SingleInstance, force
#NoEnv
SetBatchLines -1 ;run at maximum CPU utilization
esc::ExitApp
RAlt::Reload ;Right alt reloads script
;****************************** 
;get row count of file, show type of delimiter, and ask to display headers
;****************************** 
#IfWinActive ahk_class CabinetWClass ;Must come before the hotkey so it only fires in an Explorer window
RControl:: ;Right Control launches (Use when in Explorer)
  clipboard = ; Empty the clipboard
  SendInput, ^c ;SendInput works better for me.  You might try Send or SendEvent
  ClipWait, 1
  If (ErrorLevel) { ;Error checking: stop if nothing reached the clipboard
      MsgBox, No text was sent to clipboard
      Return
  }
  path:=Clipboard ;store path from clipboard

FileReadLine, Header, %path%, 1 ;read first row
Comma_Trans:=StrReplace(header,",",  "`r`n",comma_count) ;store count of commas in header row
  Tab_Trans:=StrReplace(header,A_tab,"`r`n",Tab_count)   ;store count of tabs   in header row


if (comma_Count > Tab_Count) { ;if more commas than tabs...
    Trans := Comma_Trans ;header with commas already replaced by line breaks
    Delim := "Comma Delimited File" ;set delim variable to be comma
} else if (comma_Count < Tab_Count) { ;if comma count less than tab count...
    Trans := Tab_Trans ;header with tabs already replaced by line breaks
    Delim := "Tab Delimited File" ;set delim variable to be tabs
} else { ;tie (often zero of each): don't reuse values from a previous run
    Trans := Header
    Delim := "Unknown Delimiter"
}

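;***********Use file object so it is both fast and light on memory*******************
Rows := 0 ;reset the counter so a second run doesn't start from the previous (comma-formatted) total
file := FileOpen(Path, "r") ;open file as read-only
if (!IsObject(file)) { ;FileOpen returns 0 when the file can't be opened
    MsgBox, Could not open %path%
    Return
}
while (!file.AtEOF) { ;stop once the end of the file is reached
    file.ReadLine()
    Rows++ ;keep track of how many rows
}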
file.Close()   ;close file object
   
rows:=RegExReplace(Rows, "(?:^[^1-9.]*[1-9]\d{0,2}|(?<=.)\G\d{3})(?=(?:\d{3})+(?:\D|$))", "$0,") ;this makes the number "pretty" with commas

MsgBox,4,%Delim%, %rows% lines found in %path% `n`nDisplay headers?
IfMsgBox Yes
{
    Clipboard:=Trans ;copy the header list to the clipboard
    MsgBox % Trans
}
return
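
The script above only compares commas against tabs. If you also run into pipe- or semicolon-delimited files, the same counting trick can be generalized. The following is only a rough sketch, not part of the original script: the function name DetectDelimiter and the list of candidate delimiters are my own assumptions, and it presumes an AutoHotkey v1.1 build recent enough to have StrReplace.

```autohotkey
; Hypothetical helper (not in the original script): return the candidate
; delimiter that occurs most often in a header line.
DetectDelimiter(header) {
    names := ["Comma", "Tab", "Pipe", "Semicolon"]   ; assumed candidate list
    chars := [",", A_Tab, "|", Chr(59)]              ; Chr(59) is a semicolon
    best := {name: "Unknown", char: "", count: 0}
    for i, char in chars {
        StrReplace(header, char, "", count)  ; count receives the number of matches
        if (count > best.count)              ; strict ">" keeps the earlier candidate on a tie
            best := {name: names[i], char: char, count: count}
    }
    return best
}

; Example usage with a made-up header line
result := DetectDelimiter("id|name|city")
MsgBox % result.name " delimited, " (result.count + 1) " fields"
```

A header with none of the candidates comes back as "Unknown" with a count of 0, so the caller can decide whether to show the raw first line instead.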

Here’s a video demonstrating the usage of the Text Explorer
