
Thread: Extract Domain Names Using sed


I have several large text files containing around a million domain names. I want to create a new text file with only the unique domain names in it.

This is how far I've got:

code:
sed -nr "s|(http://)?(www\.)?([^.]*)\.(.*\.?)*|\3|p" filename
test case:
code:
google.com.tk/1/2/3/
www.google.co.uk.se
google.co.au
www.google.co.uk.se/1/2/
m.google.com
http://www.google.com/au
The above command produces:

google
google
google
google
m
google
I want it to produce:

google.com.tk
google.co.uk.se
google.co.au
google.co.uk.se
m.google.com
google.com

code:
sed -nr "s|(http://)?(www\.)?([^/]+).*|\3|p" filename
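The original goal was a file of unique domain names, which the sed command alone does not give, since repeated domains are printed once per input line. A minimal sketch of one way to finish the job, assuming the inputs have one entry per line as in the test case: the function name `extract_domains` is hypothetical, `-E` is used as the more portable spelling of GNU sed's `-r`, and `LC_ALL=C sort -u` is my addition for de-duplication with a locale-independent order. Like the command above, it only strips an `http://` prefix.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print the unique domain names
# found in the given files, or on standard input if no files are given.
extract_domains() {
    # Capture group 3 is everything up to the first "/", after an optional
    # "http://" and an optional "www." prefix. Lines that do not match
    # (for example empty lines) are not printed because of -n and the p flag.
    sed -nE 's|^(http://)?(www\.)?([^/]+).*|\3|p' "$@" |
        # sort -u drops duplicates; LC_ALL=C keeps the ordering byte-wise
        # so the result does not depend on the user's locale.
        LC_ALL=C sort -u
}
```

It could be called as `extract_domains file1.txt file2.txt > unique.txt`, where the file names are placeholders. On the test case this should leave five names, because `google.co.uk.se` appears twice in the input.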

