python – Regular Expressions – testing if a String contains another String

Suppose you have this string (all on one line):

10.254.254.28 - - 06/Aug/2007:00:12:20 -0700 "GET /keyser/22300/ HTTP/1.0" 302 528 "-" "Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:1.8.1.4) Gecko/20070515 Firefox/2.0.0.4"

and you want to extract the part between GET and HTTP (i.e., the URL), but only if it contains the word ‘puzzle’. How would you do that using regular expressions in Python?

Here’s my solution so far.

match = re.search(r'GET (.*puzzle.*) HTTP', my_string)

It works, but I suspect I should change the first, the second, or both .* to .*? so that they are non-greedy. Does it actually matter in this case?

,

No need for a regex:

>>> s
'10.254.254.28 - - 06/Aug/2007:00:12:20 -0700 "GET /keyser/22300/ HTTP/1.0" 302 528 "-" "Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:1.8.1.4) Gecko/20070515 Firefox/2.0.0.4"'

>>> s.split("HTTP")[0]
'10.254.254.28 - - 06/Aug/2007:00:12:20 -0700 "GET /keyser/22300/ '

>>> if "puzzle" in s.split("HTTP")[0].split("GET")[-1]:
...   print("found puzzle")
...
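
The same check can be wrapped in a small function. This is only a sketch in Python 3: the name url_if_contains is made up for illustration, and it assumes the request has the form GET <url> HTTP/... and that the URL itself does not contain the text "HTTP" or "GET".

```python
def url_if_contains(line, word="puzzle"):
    """Return the text between GET and the first HTTP if it contains word, else None."""
    # Everything before the first "HTTP" (the whole line if "HTTP" is absent).
    before_http = line.split("HTTP")[0]
    parts = before_http.split("GET")
    if len(parts) < 2:
        # No "GET" before "HTTP": nothing to extract.
        return None
    url = parts[-1].strip()
    return url if word in url else None
```

For the sample line above this returns None, because the URL is /keyser/22300/ and does not contain 'puzzle'.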

,

It does matter. The User-Agent can contain anything, including the text " HTTP", in which case a greedy .* will match past the end of the request. Use non-greedy for both of them.
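
To see the difference, here is a small sketch. Both log lines are invented for illustration (the User-Agent strings are hypothetical), and the \S-based pattern at the end is an alternative I am adding, not something from the question.

```python
import re

# Hypothetical log line: the URL contains 'puzzle' and the User-Agent
# also happens to contain ' HTTP'.
line = ('10.254.254.28 - - 06/Aug/2007:00:12:20 -0700 '
        '"GET /puzzle/22300/ HTTP/1.0" 302 528 "-" "Bot speaks HTTP too"')

greedy = re.search(r'GET (.*puzzle.*) HTTP', line)
lazy = re.search(r'GET (.*?puzzle.*?) HTTP', line)

print(greedy.group(1))  # runs on into the User-Agent
print(lazy.group(1))    # /puzzle/22300/

# Non-greedy only changes which match is preferred, not whether one exists.
# Here 'puzzle' appears only in the (hypothetical) User-Agent, yet the
# non-greedy pattern still matches. Restricting the group to non-space
# characters (\S) is one stricter alternative, assuming URLs contain no spaces.
ua_only = ('10.254.254.28 - - 06/Aug/2007:00:12:20 -0700 '
           '"GET /keyser/22300/ HTTP/1.0" 302 528 "-" "puzzle bot HTTP client"')

print(re.search(r'GET (.*?puzzle.*?) HTTP', ua_only) is not None)  # True
print(re.search(r'GET (\S*puzzle\S*) HTTP', ua_only))              # None
```

So the non-greedy version fixes the over-long capture, but if a false match on the User-Agent is a concern, a pattern that cannot cross whitespace is probably the safer choice.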

,

>>> s = '10.254.254.28 - - 06/Aug/2007:00:12:20 -0700 "GET /keyser/22300/ HTTP/1.0" 302 528 "-" "Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:1.8.1.4) Gecko/20070515 Firefox/2.0.0.4"'
>>> s.split()[6]
'/keyser/22300/'
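
A slightly more defensive version of the same idea, as a sketch only: the function name request_path is made up, and it assumes every line follows the layout of the sample, with the method as the sixth whitespace-separated field and the path as the seventh.

```python
def request_path(line):
    """Return the request path from a log line in the sample's layout, or None."""
    fields = line.split()
    # In this layout fields[5] is '"GET' (with the opening quote) and
    # fields[6] is the path.
    if len(fields) > 6 and fields[5] == '"GET':
        return fields[6]
    return None


s = ('10.254.254.28 - - 06/Aug/2007:00:12:20 -0700 "GET /keyser/22300/ HTTP/1.0" '
     '302 528 "-" "Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:1.8.1.4) '
     'Gecko/20070515 Firefox/2.0.0.4"')

path = request_path(s)
if path is not None and "puzzle" in path:
    print("found puzzle")
```

Nothing is printed for the sample line, since its path is /keyser/22300/.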
