Results 1 to 15 of 20
  1. #1

    Hightower's Avatar
    Join Date
    Jun 2008
    Location
    Cloud 9
    Posts
    4,920
    Thank Post
    494
    Thanked 690 Times in 444 Posts
    Rep Power
    241

    Trim URL with PHP

    I have a form where a user can submit a web address, in the following format:

    Code:
    http://www.site1.com
    
    http://prefix.site1.com
    
    http://123.site1.com
    I want to trim the url so it looks like this:

    Code:
    site1.com
    I have used parse_url with PHP_URL_HOST which gives me

    Code:
    www.site1.com or
    123.site1.com
    I just need to remove the further 'www' or '123' or whatever may be there in the prefix for that matter.

    Any ideas?

  2. #2

    CESIL's Avatar
    Join Date
    Nov 2006
    Location
    Hampshire
    Posts
    1,404
    Thank Post
    109
    Thanked 267 Times in 198 Posts
    Rep Power
    168
    You can do a string search for the first occurrence of the dot and trim from there to the end of the string.

    The strpos function will find the first dot, and the value it returns can be used with substr to return the bit you want.

    Try $stripped = substr($url, strpos($url, ".") + 1);
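
A minimal sketch of the strpos/substr idea, applied to the host that parse_url returns rather than to the raw URL. The helper name strip_first_label is made up for illustration, and the guard for a missing dot is an addition.

```php
<?php
// Hypothetical helper: drop everything up to and including the first dot.
function strip_first_label(string $host): string
{
    $dot = strpos($host, '.');
    if ($dot === false) {
        return $host; // no dot at all, so there is nothing to strip
    }
    return substr($host, $dot + 1);
}

$host = parse_url('http://www.site1.com', PHP_URL_HOST); // "www.site1.com"
echo strip_first_label($host), "\n";            // site1.com
echo strip_first_label('123.site1.com'), "\n";  // site1.com
```

This cuts at the first dot unconditionally, so a host with no prefix is cut as well: google.com comes back as com.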

  3. #3

    Join Date
    Sep 2006
    Location
    West Midlands
    Posts
    410
    Thank Post
    73
    Thanked 75 Times in 58 Posts
    Rep Power
    44
    Just be careful about what format you assume a URL will appear in (before trying to parse it).

    How to Obscure Any URL makes interesting reading!

    mb

  4. #4

    Join Date
    Jan 2007
    Location
    Lowestoft, Suffolk
    Posts
    84
    Thank Post
    6
    Thanked 4 Times in 4 Posts
    Rep Power
    16
    How about explode(".",$url)?

    I too, thought of strrpos to search backwards, but .coms will have 1 dot and co.uks etc will have 2.
    I've struggled with this for a content management routine I provided for a friend of mine. It's tough to parse his updates and find the urls and make them active when displayed on the site.

    Have you looked at the full array returned by parse_url? The manual shows that a lot of information can be returned...

    Array
    (
    [scheme] => http
    [host] => hostname
    [user] => username
    [pass] => password
    [path] => /path
    [query] => arg=value
    [fragment] => anchor
    )
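
For reference, a short sketch of both ideas from this post: the full parse_url array, and explode on the host. The sample URLs are only illustrations.

```php
<?php
// parse_url() with no second argument returns every component it finds.
$parts = parse_url('http://username:password@hostname/path?arg=value#anchor');
print_r($parts);

// explode() splits the host into its dot-separated labels.
$host = parse_url('http://www.bbc.co.uk/sport', PHP_URL_HOST);
$labels = explode('.', $host);
print_r($labels); // www, bbc, co, uk
```

The labels array makes it easy to count or drop pieces, but it does not by itself say how many trailing labels belong to the domain.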

  5. #5

    CESIL's Avatar
    Join Date
    Nov 2006
    Location
    Hampshire
    Posts
    1,404
    Thank Post
    109
    Thanked 267 Times in 198 Posts
    Rep Power
    168
    You could use substr_count to see how many dots there are and then decide where to truncate.
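
A rough sketch of the substr_count idea, assuming a fixed threshold of two dots. The helper name and the $minDots parameter are hypothetical.

```php
<?php
// Hypothetical helper: strip the first label only when the host contains
// at least $minDots dots, so a bare "google.com" is left alone.
function strip_prefix_if_dots(string $host, int $minDots = 2): string
{
    if (substr_count($host, '.') < $minDots) {
        return $host;
    }
    return substr($host, strpos($host, '.') + 1);
}

echo strip_prefix_if_dots('www.site1.com'), "\n"; // site1.com
echo strip_prefix_if_dots('google.com'), "\n";    // google.com
echo strip_prefix_if_dots('bbc.co.uk'), "\n";     // co.uk
```

A fixed threshold still misfires on hosts such as bbc.co.uk, where the last two labels are not the whole domain.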

  6. #6

    Hightower's Avatar
    Join Date
    Jun 2008
    Location
    Cloud 9
    Posts
    4,920
    Thank Post
    494
    Thanked 690 Times in 444 Posts
    Rep Power
    241
    Hmmm, doesn't seem like a simple fix then?

  7. #7
    Gerry's Avatar
    Join Date
    Jun 2007
    Location
    North Wales
    Posts
    431
    Thank Post
    60
    Thanked 38 Times in 35 Posts
    Rep Power
    24
    strpos() will find the 1st occurrence, so if you only want to remove the prefix, it should work fine.

    http://www.w3schools.com/PHP/func_string_strpos.asp

  8. #8

    CESIL's Avatar
    Join Date
    Nov 2006
    Location
    Hampshire
    Posts
    1,404
    Thank Post
    109
    Thanked 267 Times in 198 Posts
    Rep Power
    168
    strpos was my suggestion...seems easy enough to me

  9. #9

    Hightower's Avatar
    Join Date
    Jun 2008
    Location
    Cloud 9
    Posts
    4,920
    Thank Post
    494
    Thanked 690 Times in 444 Posts
    Rep Power
    241
    Quote Originally Posted by Gerry View Post
    strpos() will find the 1st occurrence, so if you only want to remove the prefix, it should work fine.

    PHP strpos() Function
    But in the instance when somebody submits
    Code:
    http://google.com
    what will happen? I'll be left with
    Code:
    .com
    won't I?

  10. #10

    Join Date
    Aug 2005
    Location
    London
    Posts
    3,154
    Thank Post
    114
    Thanked 527 Times in 450 Posts
    Blog Entries
    2
    Rep Power
    123
    What I started thinking was that you need to make sure that there are at least 2 full stops left before you remove anything - that deals with "google.com" - it has 1 full stop, you don't strip any more. Sadly, it fails with bbc.co.uk - it has 2 full stops so you remove to get co.uk

    The only thing I can think of is to work back from the end and do an nslookup on each name that you get; once you can resolve, you know you've got a name.

    eg - you're given Google so you try "com" and it fails; you try "google.com" and it works

    Not sure how you do name lookups in PHP (I can do it in ASP.Net but that doesn't really help :-)) but I'm sure it must be there (everything else is)
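
PHP does have name lookups built in, for example checkdnsrr and gethostbyname. Below is a sketch of the "work back from the end" idea, with the lookup passed in as a callable so it can be tried without a network. The helper name shortest_resolvable is made up for illustration.

```php
<?php
// Hypothetical helper: try progressively longer names, starting from the
// right-hand end, and return the first one that $resolves accepts.
function shortest_resolvable(string $host, callable $resolves): ?string
{
    $labels = explode('.', $host);
    for ($i = count($labels) - 1; $i >= 0; $i--) {
        $candidate = implode('.', array_slice($labels, $i));
        if ($resolves($candidate)) {
            return $candidate;
        }
    }
    return null; // nothing resolved
}

// Real lookups: checkdnsrr() asks DNS for an A record (needs network access).
$viaDns = function (string $name): bool {
    return checkdnsrr($name, 'A');
};
// Example call, left commented out because it depends on live DNS:
// echo shortest_resolvable('www.google.com', $viaDns), "\n";
```

Whether a bare suffix such as com or co.uk fails to resolve depends on the DNS data at the time, so treat this as a heuristic. It also costs one lookup per label.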

  11. #11

    Join Date
    Jan 2007
    Location
    Lowestoft, Suffolk
    Posts
    84
    Thank Post
    6
    Thanked 4 Times in 4 Posts
    Rep Power
    16
    Use strrpos, which searches backwards, and only slice if you find more than one dot for a .com, or more than two for a .co.uk, etc. That's the problem I found: what is the domain, and how many dots does it contain?
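
One way to decide how many dots the domain contains is to check the ending against a list of known two-part suffixes. The sketch below uses explode rather than strrpos, and the suffix list is a tiny made-up sample; a real list would be far longer (the Public Suffix List is the usual source).

```php
<?php
// Hypothetical, deliberately tiny list of two-label suffixes.
const TWO_LABEL_SUFFIXES = ['co.uk', 'org.uk', 'ac.uk', 'com.au'];

// Hypothetical helper: keep the last 3 labels for a known two-label
// suffix, otherwise keep the last 2.
function registrable_domain(string $host): string
{
    $labels = explode('.', strtolower($host));
    $count = count($labels);
    if ($count < 2) {
        return $host; // single label, nothing to trim
    }
    $lastTwo = $labels[$count - 2] . '.' . $labels[$count - 1];
    $keep = in_array($lastTwo, TWO_LABEL_SUFFIXES, true) ? 3 : 2;
    return implode('.', array_slice($labels, -$keep));
}

echo registrable_domain('www.bbc.co.uk'), "\n"; // bbc.co.uk
echo registrable_domain('www.site1.com'), "\n"; // site1.com
echo registrable_domain('google.com'), "\n";    // google.com
```

Any suffix missing from the list falls back to the two-label rule, so the result is only as good as the list.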

  12. #12

    webman's Avatar
    Join Date
    Nov 2005
    Location
    North East England
    Posts
    8,406
    Thank Post
    639
    Thanked 961 Times in 661 Posts
    Blog Entries
    2
    Rep Power
    324
    I'm wondering why you would want to do this in the first place - mainly for the reasons mentioned. www.site.com might be the only hostname that works - site.com might not have an appropriate record configured. And what has also been said is that it is very hit and miss as to the results you will get back from splitting, exploding, strpos'ing etc.

  13. #13

    Hightower's Avatar
    Join Date
    Jun 2008
    Location
    Cloud 9
    Posts
    4,920
    Thank Post
    494
    Thanked 690 Times in 444 Posts
    Rep Power
    241
    Quote Originally Posted by webman View Post
    I'm wondering why you would want to do this in the first place
    I have a database that staff can add sites to, by giving an address. The site is stored exactly as they enter it
    Code:
    www.bbc.co.uk/sport/blah
    and then is sent to a text file stripped of any rubbish

    Code:
    bbc.co.uk
    so that our whitelist unblocks it. The reason I want to unblock everything for bbc.co.uk is that styles and images are stored under a different prefix (as they are for many sites nowadays), and without this the sites don't display properly.
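
One thing to watch with input stored like this: parse_url only reports a host when the string has a scheme, so www.bbc.co.uk/sport/blah is treated entirely as a path. A minimal sketch that prepends a scheme when none is present; the helper name host_from_input is hypothetical and defaulting to http is an assumption.

```php
<?php
// Hypothetical helper: get the host from input that may lack "http://".
function host_from_input(string $input): ?string
{
    $input = trim($input);
    if (!preg_match('~^[a-z][a-z0-9+.-]*://~i', $input)) {
        $input = 'http://' . $input; // assume http when no scheme is given
    }
    $host = parse_url($input, PHP_URL_HOST);
    return is_string($host) ? strtolower($host) : null;
}

var_dump(parse_url('www.bbc.co.uk/sport/blah', PHP_URL_HOST)); // NULL
echo host_from_input('www.bbc.co.uk/sport/blah'), "\n";        // www.bbc.co.uk
```

The host that comes back still carries its prefix; trimming it down to bbc.co.uk is a separate step.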

  14. #14

    powdarrmonkey's Avatar
    Join Date
    Feb 2008
    Location
    Alcester, Warwickshire
    Posts
    4,859
    Thank Post
    412
    Thanked 777 Times in 650 Posts
    Rep Power
    182
    This is dead easy with a regular expression. See Split an URL into protocol, site, and resource parts - PHP - Snipplr for an example.
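
For illustration, a sketch of a regular expression that splits a URL into protocol, site, and resource parts. This is not the linked snippet; the pattern and the helper name split_url are written for this example only, and the site part would still include any port or user:pass@ section.

```php
<?php
// Hypothetical helper: split a URL into protocol, site and resource parts.
function split_url(string $url): ?array
{
    // Optional "scheme://", then everything up to the first /, ? or #,
    // then whatever is left.
    $pattern = '~^(?:([a-z][a-z0-9+.-]*)://)?([^/?#]+)(.*)$~i';
    if (!preg_match($pattern, $url, $m)) {
        return null;
    }
    return ['protocol' => $m[1], 'site' => $m[2], 'resource' => $m[3]];
}

print_r(split_url('http://www.bbc.co.uk/sport/blah'));
print_r(split_url('www.bbc.co.uk/sport/blah'));
```

A pattern like this only separates the host from the rest of the URL; deciding which labels of the host to keep is a separate step.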

  15. #15

    CESIL's Avatar
    Join Date
    Nov 2006
    Location
    Hampshire
    Posts
    1,404
    Thank Post
    109
    Thanked 267 Times in 198 Posts
    Rep Power
    168
    ah so now you have changed the question...

    you originally said urls were entered as www.site.com etc so counting from the front would work...
