For this drill session I created three files to be searched for a string:
file1.txt contents:
string1 - file1
string1 - file1 (duplicate)
file2.txt contents:
string1 - file2
file3.txt contents:
string1 - file3
I put all three files in a single directory. Throughout the examples on this page, I vary the syntax a bit to demonstrate several ways to reach the same objective.
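The setup described above can be reproduced with a short script. This is only a minimal sketch, assuming Python 3; the helper name `make_sample_dir` and the use of a temporary directory are my own choices for illustration and are not part of the original session, which used a folder on the Desktop.

```python
import os
import tempfile

# File names and contents copied from the listing above.
SAMPLE_FILES = {
    "file1.txt": "string1 - file1\nstring1 - file1 (duplicate)\n",
    "file2.txt": "string1 - file2\n",
    "file3.txt": "string1 - file3\n",
}


def make_sample_dir(parent=None):
    """Create the three sample files in a fresh directory and return its path.

    Hypothetical helper: the directory is a temporary one, not the
    Desktop path used in the scripts on this page.
    """
    target = tempfile.mkdtemp(prefix="search_drill_", dir=parent)
    for name, text in SAMPLE_FILES.items():
        with open(os.path.join(target, name), "w") as handle:
            handle.write(text)
    return target


if __name__ == "__main__":
    # Print the location so it can be pasted into the search scripts.
    print(make_sample_dir())
```

The returned path can be used in place of the directory passed to `os.walk` in the scripts on this page.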
The following script searches every file in a directory for a string and prints the name of each file that contains it. A file name is printed only once, even if the string occurs several times in that file:
# Tested with Python 2.7.3
import os

for root, dirs, files in os.walk(r'C:\<path_omitted>\Desktop\python\dir'):  # change path
    for file in files:
        with open(os.path.join(root, file), "r") as auto:
            f = auto.read(50000)  # only the first 50,000 bytes of each file are searched
            if "string1" in f:
                print "OK:", file
Output:
>>>
OK: file1.txt
OK: file2.txt
OK: file3.txt
>>>
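The same idea can be wrapped in a reusable function. The sketch below assumes Python 3 and is a variation of mine, not the script tested above: the name `files_containing` is hypothetical, and instead of reading a fixed 50,000-byte chunk it scans each file line by line and stops at the first match. One consequence is that a search string spanning a line break will not be found.

```python
import os
import tempfile


def files_containing(root_dir, needle):
    """Return the sorted names of files under root_dir that contain needle.

    Each file is listed once, however many of its lines match.
    """
    found = []
    for root, _dirs, files in os.walk(root_dir):
        for name in files:
            path = os.path.join(root, name)
            # errors="replace" is an assumption so that non-text files do
            # not abort the scan; remove it to fail loudly instead.
            with open(path, "r", errors="replace") as handle:
                # any() stops reading the file at the first matching line.
                if any(needle in line for line in handle):
                    found.append(name)
    return sorted(found)


if __name__ == "__main__":
    # Small self-contained demo in a throwaway directory.
    with tempfile.TemporaryDirectory() as demo_dir:
        with open(os.path.join(demo_dir, "file1.txt"), "w") as handle:
            handle.write("string1 - file1\nstring1 - file1 (duplicate)\n")
        with open(os.path.join(demo_dir, "notes.txt"), "w") as handle:
            handle.write("nothing to see here\n")
        print(files_containing(demo_dir, "string1"))
```

Returning a list rather than printing makes the result easy to reuse, for example to feed the matching files into a later step.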
The next script prints the full file path followed by each line that contains the string:
# Tested with Python 2.7.3
import os

for root, dirs, files in os.walk(r'C:\<path_omitted>\Desktop\python\dir'):  # change path
    for file in files:
        file = os.path.join(root, file)  # full path to the current file
        with open(file) as f:
            for line in f:
                if "string1" in line:
                    print file, ":", line.rstrip()
Output:
>>>
C:\<path_omitted>\Desktop\python\dir\file1.txt : string1 - file1
C:\<path_omitted>\Desktop\python\dir\file1.txt : string1 - file1 (duplicate)
C:\<path_omitted>\Desktop\python\dir\file2.txt : string1 - file2
C:\<path_omitted>\Desktop\python\dir\file3.txt : string1 - file3
>>>
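If the directory also holds files that should not be searched, the `fnmatch` module can narrow the walk to certain file names. The following is a rough sketch under my own assumptions: it targets Python 3, and the function name `matching_lines` and the default pattern `*.txt` are hypothetical choices for illustration.

```python
import fnmatch
import os


def matching_lines(root_dir, needle, pattern="*.txt"):
    """Return (full_path, line) pairs for every line that contains needle.

    Only files whose names match the glob-style pattern are opened.
    """
    results = []
    for root, _dirs, files in os.walk(root_dir):
        # fnmatch.filter keeps only the names that match the pattern.
        for name in fnmatch.filter(files, pattern):
            path = os.path.join(root, name)
            # errors="replace" is an assumption to tolerate odd encodings.
            with open(path, "r", errors="replace") as handle:
                for line in handle:
                    if needle in line:
                        results.append((path, line.rstrip()))
    return results
```

Calling `matching_lines(some_dir, "string1")` would then return the same path and line pairs that the script above prints, restricted to `.txt` files.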
The following script prints only the file name, without the full path, followed by each line that contains the string:
# Tested with Python 2.7.3
import os

for root, dirs, files in os.walk(r'C:\<path_omitted>\Desktop\python\dir'):  # change path
    for file in files:
        with open(os.path.join(root, file)) as f:
            for line in f:
                if "string1" in line:
                    # rstrip() removes the trailing line break so the output stays neat
                    print file, ":", line.rstrip()
Output:
>>>
file1.txt : string1 - file1
file1.txt : string1 - file1 (duplicate)
file2.txt : string1 - file2
file3.txt : string1 - file3
>>>
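One caveat with bare file names: `os.walk` also descends into subdirectories, so two files with the same name in different folders become indistinguishable in the output. A possible middle ground, sketched here for Python 3 with a hypothetical function name, is to report each path relative to the search directory using `os.path.relpath`.

```python
import os


def matches_with_relative_paths(root_dir, needle):
    """Return sorted (relative_path, line) pairs for lines containing needle."""
    results = []
    for root, _dirs, files in os.walk(root_dir):
        for name in files:
            path = os.path.join(root, name)
            # errors="replace" is an assumption to tolerate odd encodings.
            with open(path, "r", errors="replace") as handle:
                for line in handle:
                    if needle in line:
                        # Path relative to the directory being searched.
                        rel = os.path.relpath(path, root_dir)
                        results.append((rel, line.rstrip()))
    return sorted(results)
```

For the three sample files, which all sit directly in the search directory, the relative path is simply the file name, so the result matches the output shown above.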
The following script builds the same file name and matching line pairs, but saves them to a file instead of printing them. If the output file does not exist yet, the script creates it; if it already exists, it is overwritten:
# Tested with Python 2.7.3
import os

with open("test.txt", "w") as myfile:
    for root, dirs, files in os.walk(r'C:\<path_omitted>\Desktop\python\dir'):  # change path
        for file in files:
            with open(os.path.join(root, file)) as f:
                for line in f:
                    if "string1" in line:
                        # print file, ":", line.rstrip()
                        # "\n" is written as the platform line ending in text mode
                        summary = file, ": ", line.rstrip(), "\n"
                        myfile.write(''.join(summary))
The contents of test.txt will be:
file1.txt: string1 - file1
file1.txt: string1 - file1 (duplicate)
file2.txt: string1 - file2
file3.txt: string1 - file3
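For reference, here is how the save-to-file step might look as a function in Python 3. It is a sketch rather than the tested script: `write_matches` and its parameters are hypothetical names, and it assumes the output file lives outside the directory being searched, since otherwise the walk could pick up the output file itself.

```python
import os


def write_matches(root_dir, needle, out_path):
    """Write 'name: line' for every matching line to out_path.

    Returns the number of lines written. out_path is assumed to be
    outside root_dir.
    """
    count = 0
    # Mode "w" creates the file if it is missing and truncates it otherwise.
    with open(out_path, "w") as out:
        for root, _dirs, files in os.walk(root_dir):
            # sorted() only makes the order within a directory predictable.
            for name in sorted(files):
                path = os.path.join(root, name)
                # errors="replace" is an assumption to tolerate odd encodings.
                with open(path, "r", errors="replace") as handle:
                    for line in handle:
                        if needle in line:
                            # In text mode "\n" becomes the platform line ending.
                            out.write("{}: {}\n".format(name, line.rstrip()))
                            count += 1
    return count
```

Writing `"\n"` and letting text mode translate it avoids hard-coding a line ending that only suits one platform.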