
Find String in Files with Python


For this drill session I created three files to be searched for a string:

file1.txt contents:

string1 file1
string1 file1 (duplicate)

file2.txt contents:

string1 file2 

file3.txt contents:

string1 file3 

I put all three files in a single directory. The examples on this page vary the syntax slightly to show several ways of reaching the same objective.
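To reproduce the setup, the three files can be generated by a short helper. This is only a sketch, assuming Python 3; the names SAMPLE_FILES and write_sample_files are made up for illustration, and a temporary directory stands in for the Desktop folder used on this page.

```python
import os
import tempfile

# Contents of the three sample files, as listed above.
SAMPLE_FILES = {
    "file1.txt": "string1 file1\nstring1 file1 (duplicate)\n",
    "file2.txt": "string1 file2\n",
    "file3.txt": "string1 file3\n",
}


def write_sample_files(target):
    """Create the three sample files inside the existing directory `target`."""
    for name, text in SAMPLE_FILES.items():
        with open(os.path.join(target, name), "w") as handle:
            handle.write(text)


# Demonstration in a throwaway directory (hypothetical location).
with tempfile.TemporaryDirectory() as sample_dir:
    write_sample_files(sample_dir)
    print(sorted(os.listdir(sample_dir)))  # ['file1.txt', 'file2.txt', 'file3.txt']
```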

The following script searches every file in a directory for a string and prints the names of the files that contain it. Each filename is printed only once, even if the string occurs several times in the same file:

# Tested with Python 2.7.3

import os
import fnmatch

for root, dirs, files in os.walk(r'C:\<path_omitted>\Desktop\python\dir'):  # change path
    for file in files:
        with open(os.path.join(root, file), "r") as auto:
            f = auto.read(50000)  # reads at most the first 50000 bytes of the file
            if "string1" in f:
                print "OK: ", file

Output:

>>> 
OK:  file1.txt
OK:  file2.txt
OK:  file3.txt
>>> 
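The same idea can be wrapped in a function that returns the names instead of printing them. What follows is a sketch for Python 3, not the script tested above: the function name files_containing is hypothetical, and it reads each file line by line rather than taking the first 50000 bytes, so a string that spans a line break would be missed.

```python
import os
import tempfile


def files_containing(directory, needle):
    """Return the sorted names of files under `directory` that contain `needle`."""
    found = []
    for root, dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            # errors="ignore" drops bytes that do not decode; an assumption
            # made here so a stray binary file does not stop the walk.
            with open(path, "r", errors="ignore") as handle:
                # any() stops at the first matching line, so each file is
                # reported once however many lines match.
                if any(needle in line for line in handle):
                    found.append(name)
    return sorted(found)


# Demonstration in a throwaway directory (hypothetical file names).
with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, "a.txt"), "w") as handle:
        handle.write("string1 here\nstring1 again\n")
    with open(os.path.join(demo_dir, "b.txt"), "w") as handle:
        handle.write("nothing to see\n")
    print(files_containing(demo_dir, "string1"))  # ['a.txt']
```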

The next script prints the full path of each file together with every line that contains the string:

# Tested with Python 2.7.3

import os
import fnmatch

for root, dirs, files in os.walk(r'C:\<path_omitted>\Desktop\python\dir'):  # change path
    for file in files:
        file = os.path.join(root, file)  # build the full path to the file
        with open(file) as f:
            for line in f:
                if "string1" in line:
                    print file, ": ", line.rstrip()

Output:

>>> 
C:\<path_omitted>\Desktop\python\dir\file1.txt :  string1 file1
C:\<path_omitted>\Desktop\python\dir\file1.txt :  string1 file1 (duplicate)
C:\<path_omitted>\Desktop\python\dir\file2.txt :  string1 file2
C:\<path_omitted>\Desktop\python\dir\file3.txt :  string1 file3
>>> 
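When the matches feed into further processing, a generator can be handier than printing. The sketch below assumes Python 3; find_lines is a hypothetical name, and the line number it reports is an addition that the scripts on this page do not have.

```python
import os
import tempfile


def find_lines(directory, needle):
    """Yield (path, line_number, line) for every line that contains `needle`."""
    for root, dirs, files in os.walk(directory):
        for name in sorted(files):
            path = os.path.join(root, name)
            with open(path, "r", errors="ignore") as handle:
                # enumerate(..., start=1) gives 1-based line numbers.
                for number, line in enumerate(handle, start=1):
                    if needle in line:
                        yield path, number, line.rstrip()


# Demonstration in a throwaway directory (hypothetical location).
with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, "file1.txt"), "w") as handle:
        handle.write("string1 file1\nother text\nstring1 file1 (duplicate)\n")
    for path, number, text in find_lines(demo_dir, "string1"):
        print("{}:{}: {}".format(os.path.basename(path), number, text))
    # file1.txt:1: string1 file1
    # file1.txt:3: string1 file1 (duplicate)
```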

The following script prints only the file name, without the full path, together with every line that contains the string:

# Tested with Python 2.7.3

import os
import fnmatch

for root, dirs, files in os.walk(r'C:\<path_omitted>\Desktop\python\dir'):  # change path
    for file in files:
        with open(os.path.join(root, file)) as f:
            for line in f:
                if "string1" in line:
                    print file, ": ", line.rstrip()  # rstrip() removes the trailing line break to give neat results

   
Output:

>>> 
file1.txt :  string1 file1
file1.txt :  string1 file1 (duplicate)
file2.txt :  string1 file2
file3.txt :  string1 file3
>>> 
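The scripts on this page import fnmatch without calling it. One possible use, which is my assumption rather than something shown here, is to restrict the search to files whose names match a pattern such as *.txt. A sketch for Python 3, with the hypothetical function name find_in_matching_files:

```python
import fnmatch
import os
import tempfile


def find_in_matching_files(directory, needle, pattern="*.txt"):
    """Return (file name, line) pairs, searching only files that match `pattern`."""
    results = []
    for root, dirs, files in os.walk(directory):
        # fnmatch.filter() keeps only the names that match the shell-style pattern.
        for name in fnmatch.filter(sorted(files), pattern):
            with open(os.path.join(root, name), "r", errors="ignore") as handle:
                for line in handle:
                    if needle in line:
                        results.append((name, line.rstrip()))
    return results


# Demonstration in a throwaway directory (hypothetical file names).
with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, "notes.txt"), "w") as handle:
        handle.write("string1 in a text file\n")
    with open(os.path.join(demo_dir, "notes.log"), "w") as handle:
        handle.write("string1 in a log file\n")
    print(find_in_matching_files(demo_dir, "string1"))
    # [('notes.txt', 'string1 in a text file')]
```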

The following script writes the file name (without the full path) and each matching line to a file instead of printing them. If the output file does not exist yet, the script creates it; an existing file is overwritten:

# Tested with Python 2.7.3

import os
import fnmatch

with open("test.txt", "w") as myfile:
    for root, dirs, files in os.walk(r'C:\<path_omitted>\Desktop\python\dir'):  # change path
        for file in files:
            with open(os.path.join(root, file)) as f:
                for line in f:
                    if "string1" in line:
                        #print file, ": ", line.rstrip()
                        summary = file, ": ", line.rstrip(), "\n"  # "\n" ends each record with a line break
                        myfile.write(''.join(summary))

   
File contents will be:

>>> 
file1.txt: string1 file1
file1.txt: string1 file1 (duplicate)
file2.txt: string1 file2
file3.txt: string1 file3
>>> 
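One thing to watch: if the output file sits inside the directory being searched, the walk could end up reading its own output. The sketch below, assuming Python 3 and using the hypothetical name write_report, skips the report file and returns the number of matches written.

```python
import os
import tempfile


def write_report(directory, needle, report_path):
    """Write 'name: line' for each match to `report_path`; return the match count."""
    report_abs = os.path.abspath(report_path)
    count = 0
    with open(report_path, "w") as report:
        for root, dirs, files in os.walk(directory):
            for name in sorted(files):
                path = os.path.join(root, name)
                if os.path.abspath(path) == report_abs:
                    continue  # do not search the report itself
                with open(path, "r", errors="ignore") as handle:
                    for line in handle:
                        if needle in line:
                            report.write("{}: {}\n".format(name, line.rstrip()))
                            count += 1
    return count


# Demonstration in a throwaway directory (hypothetical location and report name).
with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, "file1.txt"), "w") as handle:
        handle.write("string1 file1\nstring1 file1 (duplicate)\n")
    report_file = os.path.join(demo_dir, "report.txt")
    print(write_report(demo_dir, "string1", report_file))  # 2
```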
By privilege15