Help needed tweaking rewrite rules in lieu of htaccess

JMMotyer
-


Joined: 06 Jul 2005
Posts: 78
Location: Burlington (Toronto-ish), Ontario, Canada

Posted: Thu Aug 04, 2022 9:24 pm    Post subject: Help needed tweaking rewrite rules in lieu of htaccess

Hello again, folks.

A few months ago, the Aprelium administrators & user Horizon introduced me to & helped me with some rewrite rules, and I'm hoping that someone can help me "tweak" those rewrite rules.

I've been hosting a dozen or so sites flawlessly on my X2 installation (currently Windows10 x64) for more than a decade, consisting of simple HTML & PHP scripts & CMSs. Note that ALL of my document roots & aliases are pointing to their separate & unique subfolders. Everything was working flawlessly until a few months ago, when I wished at that time to start using a photo-gallery script that was written for Apache & which needed to use htaccess (I will hereafter refer to that photo-gallery script as X3). As Abyss does not use htaccess, the good folks at Aprelium were able to review the X3 htaccess file & come up with some Abyss rewrite rules that acted as the equivalent of the htaccess file.

The following is an example of those rewrite rules (see the last 4 rules):

In the above screenshot, the first rule is one I have for all my hosts; it just redirects all non-WWW traffic to WWW. The X3 htaccess-equivalent rules are the last 4 rules, and they seem to work perfectly for my 3 sites (hosts). Note that the 2 rules with app/parsers/slir/index.php?$1 look identical, but the 1st one is used for the condition REQUEST_FILENAME is not a file, and the 2nd one is used for the condition QUERY_STRING matches with ^(?)debug($|&).

For those three X3 sites (hosts), their respective document roots point directly to their respective X3 subfolders, and as the X3 photo-galleries are basically the only thing running on those sites, the 4 rewrite rules work perfectly. User Horizon was able to help me with the other 3 rewrite rules, for my guestbooks & webstats scripts :-).

Now, my reason for wanting to tweak those X3 rewrite rules is, I wish to add that X3 photo-gallery script to some of my normal hosts, which have their root directories pointing to, for example, my CMS (PHP) that I am running.

I have set up a test host (https://www.AuroraWings.me), with the CMS installed & running, as well as a few PHP scripts running by use of aliases, and everything works perfectly. I then installed the X3 photo-gallery script into its own subfolder, and created an alias that points to that subfolder. My problem is that the 4 rewrite rules that I am trying to use (below) are not working:


The following is a sample of my directory structure (example-only):
    d:/AuroraWings/cms (this is this host's document root)
    d:/AuroraWings/cpg (an alias on this host points to this folder)
    d:/AuroraWings/forums (an alias on this host points to this folder)
    d:/AuroraWings/gbook (an alias on this host points to this folder)
    d:/AuroraWings/guestbook (an alias on this host points to this folder)
    d:/AuroraWings/webstats (an alias on this host points to this folder)
    d:/AuroraWings/x3 (an alias on this host points to this folder)

I am hoping that someone can think of a way, that I can have those 4 (or more) htaccess-equivalent rewrite rules ONLY come into effect when accessing /x3.

Thank you all in advance, and have a great day.

Regards,
John
Horizon
-


Joined: 18 Feb 2022
Posts: 61

Posted: Fri Aug 05, 2022 6:19 am

Hello,
I think that there is a RegEx and a rewrite rules ordering problem in these screenshots.

First you need to give a lower priority to all rules that match everything and rules relative to the base /.

The most precise regex / relative-to-base rules must have priority (be higher in the list) over the rules that match things more widely.

But before I get into the details, a few pieces of advice:

1.

REQUEST_FILENAME is to be used when you want to delegate the request to Abyss itself (and not any script interpreter).

So it makes sense to use REQUEST_FILENAME when checking if a file exists, but it does not make sense to use it as a variable for internal redirects to a PHP script.

REQUEST_URI is to be used as-is (in URI format, percent-encoded such as %2F or %20) when you want to pass the variable to a PHP script, for example.

It also has good use for verifying the text of the request URI itself rather than checking it against real server files.

REQUEST_URI is to be used typically by scripting interpreters like PHP.

2.

In one of your RegEx's you used the question mark as-is.

The question mark ? has special significance in RegEx, so if you try to use it directly without escaping it you will get unpredictable behavior in your scripts.

It should be escaped if you want to use it literally:
\?
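
To illustrate the point about escaping, here is a minimal sketch using Python's re module rather than Abyss itself. The request text is a made-up example, and Abyss's regex engine may differ in details, but the meaning of an unescaped ? as "the previous item is optional" is common to most regex flavors.

```python
import re

# Hypothetical text, used only to illustrate the effect of escaping.
text = "index.php?debug"

# Unescaped: "p?" means "an optional p", so no literal "?" is ever matched.
unescaped = re.compile(r"index\.php?debug")

# Escaped: "\?" matches a literal question mark.
escaped = re.compile(r"index\.php\?debug")

print(bool(unescaped.search(text)))             # False
print(bool(unescaped.search("index.phdebug")))  # True
print(bool(escaped.search(text)))               # True
```

The unescaped pattern silently matches strings such as index.phdebug instead of the intended one, which is why the result looks unpredictable.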

3.

In your first screenshots your gbook, guestbook and webstats rules match only exactly literally.

I don't remember, but was it what you wanted?

Would you prefer using something like this instead (?):
^/(gbook|guestbook|webstats)(|/.*)$

The redirect would be:
/$1/index.php?$2
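
A quick way to see what this pattern and redirect would do is to try them outside the server. The sketch below uses Python's re module as a stand-in for Abyss's matcher (an assumption, since the engines may differ slightly), and the function name rewrite is only illustrative.

```python
import re

# The pattern suggested above: one of three alias names, optionally
# followed by a slash and anything else.
ALIAS_RULE = re.compile(r"^/(gbook|guestbook|webstats)(|/.*)$")


def rewrite(path):
    """Return the internal redirect target, or None if the rule does not apply."""
    m = ALIAS_RULE.match(path)
    if m is None:
        return None
    # Mirrors the redirect "/$1/index.php?$2".
    return "/{}/index.php?{}".format(m.group(1), m.group(2))


print(rewrite("/gbook"))           # /gbook/index.php?
print(rewrite("/guestbook/sign"))  # /guestbook/index.php?/sign
print(rewrite("/gbook2"))          # None
```

Note that /gbook2 is left alone: the second group must be either empty or start with a slash, so similarly named folders are not caught by accident.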

4.

The rule below webstats in your first screenshots checks for html, json, xml, atom & rss files to redirect to a php script.

Did you actually want to redirect only these extensions?
Or was your intent to redirect non-php files?

If yes, would this matching be good for your intent (?):
(.+?)(?!\.php$|\.php\?)(\..+?)$

Anyway there were perhaps problems with your RegEx:

The redirect only included the filename (or rather, full URI) $1 but not the extension $2.

You also seem to check for slashes (/) in both rule 5 and rule 8 of your first screenshot, but you include the slash in the redirect anyway.

Because if you capture (.+) at the beginning without ^, and if you capture ^(.*), it will capture / anyway in both cases (?).

5.

When checking QUERY_STRING you don't need to include the question mark ?.

It's already isolated to only take for example arg=value&result=true.

You also basically used it in ^(?)debug($|&) which means here 'empty non-greedy capturing group'.

This did not do anything, since you just captured in RegEx non-greedy mode, empty text.

I think that you actually wanted this instead:
^(?:|.*?&)debug(?:\=.*|&.*|$)$

This will match a debug query argument when it's in the form debug=true, for example, but it can also match it when it's only 'debug' without '=true'.

And, you don't seem to forward this debug argument anywhere in your redirect.

You didn't capture it as in '(debug)', so the redirect script probably will not know that you want debugging in the end.

Perhaps you actually need this instead:
^(?:|.*?&)(debug)(?:\=.*|&.*|$)$

But still, where do the /slir scripts go?

What I see is that you used $1 in the redirect for ^render/. but this is a RegEx capture variable - I don't see any RegEx capture group in these two rewrite rules.

So, perhaps this instead:
^render($|/.*)$

If you use this then the non-debug has just $1 and the debug rewrite rule has both a $1 (past render/) and a $2 that will be filled with the word 'debug' if the debug rewrite rule takes effect, that you will need to pass to your slir php script.
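
The two patterns above can be checked quickly outside the server. This is a sketch with Python's re module, assuming it behaves like Abyss's regex engine for these constructs; the query strings and paths are invented test inputs.

```python
import re

# QUERY_STRING pattern with the capturing (debug) group suggested above.
DEBUG_QS = re.compile(r"^(?:|.*?&)(debug)(?:\=.*|&.*|$)$")

# Path pattern: "render" alone, or "render/" followed by anything.
RENDER = re.compile(r"^render($|/.*)$")


def has_debug(query_string):
    """True when 'debug' appears as a whole query argument."""
    return DEBUG_QS.match(query_string) is not None


for qs in ["debug", "debug=true", "a=1&debug&b=2", "nodebug=1", "debugger"]:
    print(qs, has_debug(qs))

# The capture group holds whatever follows "render".
print(repr(RENDER.match("render/img.jpg").group(1)))  # '/img.jpg'
print(repr(RENDER.match("render").group(1)))          # ''
```

The first three query strings match and the last two do not, because debug must be a complete argument name and not part of a longer word.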

--

Now getting into the details, I see probably a big problem in your last screenshot.

For the aurorawings hostname, does the rewrite rule only activate if the HTTP_HOST is not already www.aurorawings.me?

Otherwise the rules below will never activate and since it's an external redirect you will enter a redirection loop.
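
The loop can be pictured with a small simulation. This is only a sketch of the idea, not of Abyss internals: next_host and follow are hypothetical helpers, and the guard stands in for a condition on HTTP_HOST.

```python
CANONICAL = "www.aurorawings.me"


def next_host(host, guarded):
    """Return the host the rule redirects to, or None if no redirect is issued.

    With guarded=True the rule is skipped when the host is already canonical,
    which models a condition on HTTP_HOST.
    """
    if guarded and host == CANONICAL:
        return None
    return CANONICAL


def follow(host, guarded, max_hops=5):
    """Follow redirects; return the number of hops, or None if it never settles."""
    for hops in range(max_hops):
        target = next_host(host, guarded)
        if target is None:
            return hops
        host = target
    return None


print(follow("aurorawings.me", guarded=True))   # 1
print(follow("aurorawings.me", guarded=False))  # None
```

Without the guard, the request for the www host is redirected to itself again and again, which is what a browser reports as a redirection loop.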

Another technical picky detail is the order of the Abyss Web Server modules.

Reverse-Proxy has priority over URLRewrite for example.

Aprelium developers would need to clarify the exact Abyss Web Server modules priority order.

By any chance, since you use directory aliases, do they not work because Directory Alias has higher priority / precedence over URLRewrite?

And since it maps to real server directories directly, then no further internal request exists anymore for URLRewrite and the PHP script if needed is run directly without URLRewrite.

There are many points in your rewrite rules that are troubling due to not knowing what the original htaccess files were.

Do you have the original htaccess files for all the php scripts that you use?

Most of the rewrite rule problems stem from either a RegEx or a rule priority order problem.

If you replace Directory Alias rules with equivalent URLRewrite ones, you could solve the module priority order problem.
JMMotyer
-


Joined: 06 Jul 2005
Posts: 78
Location: Burlington (Toronto-ish), Ontario, Canada

Posted: Fri Aug 05, 2022 7:45 am

Wow, thanks for getting back to me so fast :-). I will try to clarify what the problem is, and what I am hoping to be able to accomplish.

Firstly, I have 19 hosts in total. 15 of those hosts have a single rewrite rule, the one that redirects all non-WWW traffic to WWW. All of those 15 hosts also use aliases, most of those hosts with only about 4 aliases, but a few of those 15 hosts use up to 25 aliases each ;-). The aliases have all worked flawlessly for over a decade.

Where I needed to start using additional rewrite rules beyond the WWW rule, is for 3 additional hosts, which use that X3 photo-gallery script, which was written for Apache, and which uses htaccess files to operate. Aprelium support came up with the following for those three X3 hosts, and which have worked flawlessly for the last 6 months or so:


The above is a cut & paste of a screenshot, hence the shadings aren't correct ;-).

Now, a couple of months ago, I had wanted to be able to use some other PHP scripts (via aliases) with those three X3 hosts, and you came up with the following (the 3 Global rules), which has also worked flawlessly these last couple of months:


But the above rewrite rules work ONLY for the three X3 hosts, which point their document roots directly to their respective X3 folders.

What I wish to be able to do, is to take an existing host which uses a single rewrite rule (for WWW) & which currently has a couple of dozen aliases, and have that host also be able to use that X3 photo-gallery script. That 2nd screenshot in my original post was just me trying to get the rewrite rules to work for this particular host, but I have no idea what I was doing ;-).

I will look through your reply in the morning, and possibly some of it will make sense to me. Thank you for taking the time to reply, and have yourself a great day.

Regards,
John
JMMotyer
-


Joined: 06 Jul 2005
Posts: 78
Location: Burlington (Toronto-ish), Ontario, Canada

Posted: Sat Aug 06, 2022 6:42 am

Hi again, Horizon.

Your items # 1 & 2:

The 4 rewrite rules with a ? (question mark) were given to me by Aprelium Support, and are for my use as a workaround only on 3 of my Abyss hosts because Abyss does not use .htaccess, and .htaccess is required for my X3 photo-gallery script to work.

The 4 rewrite rules for that X3 photo-gallery script work perfectly, as long as the document root for each of the three X3 hosts point directly to their respective X3 folders.

Again, note that those 4 rewrite rules are used solely with my 3 Abyss hosts on which my X3 photo-gallery script is in use... they are not used on any of my other Abyss hosts. Two of my X3 photo-gallery sites, by the way, are:

That .htaccess that is required for the X3 photo-gallery script to work, I've zipped it up HERE, in case you wish to see it. That is what Aprelium Support referred to, when they came up with the 4 equivalent rewrite rules that I am using.

Your item # 3:

The reason for those 3 global rewrite rules, is so that those 3 aliases (/gbook, /guestbook and /webstats) work on the 3 hosts that are using the X3 photo-gallery script. Without those 3 rewrite rules, I had to enter in the complete URL with filename. Example:

Without those 3 global rewrite rules, I would have to include the filename:

With those 3 global rewrite rules, I can enter in the following, without the filename:

Your items # 4 & 5:

Regarding the suggestions that you give, I have to mention that although I understand the purpose of rewrite rules, I in fact know absolutely nothing about the symbols used in rewrite rules, nor am I able to begin to comprehend the logic about how they work.

Perhaps that X3 .htaccess file (again HERE) will make sense of what I currently have for those 4 specific X3 photo-gallery rewrite rules.

Quote:
For the aurorawings hostname, does the rewrite rule only activate if the HTTP_HOST is not already www.aurorawings.me?

Otherwise the rules below will never activate and since it's an external redirect you will enter a redirection loop.
Correct...


Quote:
By any chance, since you use directory aliases, do they not work because Directory Alias has higher priority / precedence over URLRewrite?

And since it maps to real server directories directly, then no further internal request exists anymore for URLRewrite and the PHP script if needed is run directly without URLRewrite.
My aliases have worked flawlessly since I started using Abyss more than 17 years ago, and continue to work flawlessly :-), with all my aliases configured under Aliases for their respective hosts, and NOT under URL Rewriting.

However, for my three hosts on which I am using X3 photo-gallery, and whose document roots point directly to their respective X3 folders, none of my existing aliases for those hosts work, hence me needing to add those 3 Global rewrite rules (below) in order to get those 3 aliases to work for those three X3 hosts:

All that I am trying to do is, on my MAIN site which currently has about 25 aliases configured, and which currently has just a single rewrite rule (the rule that redirects non-WWW to WWW), to be able to run that X3 photo-gallery as well, by way of an alias pointing to the /X3 folder, and whatever additional rewrite rules would be required. Hence me experimenting with that screenshot showing the 4 rules with (type) relative to base and (base virtual path) of /x3 (which is its alias). And rather than mess with my main site, I am temporarily using my AuroraWings test site.

Sorry to be so long-winded, and most-likely have confused you more than ever.
Horizon
-


Joined: 18 Feb 2022
Posts: 61

Posted: Sat Aug 06, 2022 1:59 pm

Hello,

I got this for the .htaccess translation of the X3 gallery script:

1. Set this environment variable in your Abyss Web Server hostname configuration to an empty value:

HTTP_MOD_REWRITE

Your original htaccess wants it to detect mod_rewrite in Apache.

2. You can hide X3 diagnostics information from visitors by setting this environment variable to 'On':

X3_HIDE_DIAGNOSTICS

Additional environment variables can be added in the interpreter settings where PHP is declared.

3. The X3 htaccess recommends disallowing directory listing, so you should do it.

4. If these file extensions (mime types) don't match these values for your Abyss Web Server host entry where X3 is installed, you need to rectify them to be this:

json:
application/json
js:
application/javascript
svg:
image/svg+xml
webp:
image/webp
mp4:
video/mp4

In the server mimetypes configuration it's not a problem if there's more than one file extension associated with a mimetype.

It's just that each mimetype above needs to at least have the specified file extensions associated.

For example, application/json needs to have at least json associated with it.

5. Probably because digital media is already compressed, your X3 script recommends only using gzip compression for the following mimetypes:

- application/javascript
- application/json
- application/xml
- image/svg+xml
- text/css
- text/html
- text/xml

So you could enable compression in Abyss Web Server for your host entry running X3 such that only these mimetypes get compressed when sent to visitors.

6. This X3 script instructs web browsers to disable mimetype-sniffing.

mimetype-sniffing means that web browsers try to determine the file type themselves instead of listening to your server.

You can tell web browsers not to do this by adding this HTTP header in Abyss Web Server for your host running the X3 gallery script:

X-Content-Type-Options

The value needs to be:
nosniff

This extra HTTP response header should be set in your hostname entry's General - Advanced Settings section.

7. While you're in the General - Advanced Settings section of your hostname running the X3 gallery, you can also replicate these file expiry time rules in the original htaccess (labeled 'File Expiration Times'):

text/html:
0 seconds
Relative to current time

text/xml:
3600 seconds
Relative to current time

application/xml:
3600 seconds
Relative to current time

The 'access' time used in htaccess is equivalent to current time in Abyss Web Server.

8. I personally think that you shouldn't use the default htaccess's 10-year caching period, since if you later want to modify your pictures in the server directory, web browsers theoretically would not pick up your modifications until 10 years have passed or visitors manually clear their web browser cache.

9. Now for what I understood of the htaccess rules:

(you will need to put the x3 folder in the root directory of the host for www.aurorawings.me without using aliases)

Rule 1: ---------------

If a request for a file with the extensions html, json, xml, atom or rss is going to point to a non-existing file, then internally redirect the request to a php script without the requested file extension (to find a folder with the same filename).

The trailing slash '/' is used to require a folder instead of a file.

-----------------------
Code:
- Relative to base:
  /x3

- Apply to requests matching:
  ^(.+)\.(?:html|json|xml|atom|rss)$

- Condition:
  REQUEST_FILENAME is not a file

- Redirect to:

  Internal redirect:
  index.php?$1/

- Next action:
  Stop matching
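
For readers who want to see what Rule 1 does to a path, here is a rough Python sketch. It assumes that the path handed to the pattern is the part after the /x3/ base with no leading slash, and it replaces the 'REQUEST_FILENAME is not a file' condition with a plain boolean argument; rule1 and the sample paths are illustrative only.

```python
import re

RULE1 = re.compile(r"^(.+)\.(?:html|json|xml|atom|rss)$")


def rule1(relative_path, is_file):
    """Return the internal redirect target, or None if Rule 1 does not apply.

    relative_path: request path after the /x3/ base (assumed form).
    is_file: stands in for the server's check that the file really exists.
    """
    m = RULE1.match(relative_path)
    if m is None or is_file:
        return None
    # Mirrors "index.php?$1/": drop the extension, add a trailing slash.
    return "index.php?{}/".format(m.group(1))


print(rule1("galleries/birds.html", is_file=False))  # index.php?galleries/birds/
print(rule1("galleries/birds.html", is_file=True))   # None
print(rule1("galleries/birds.jpg", is_file=False))   # None
```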

Rule 2: ---------------

Redirect calls targeting the /render directory to the X3 image resizer script.

This happens if the file does not exist or if the query string contains the word 'debug'.

Without the 'debug' word, if the file exists the 'render/' requests are not redirected.

-----------------------
Code:
- Relative to base:
  /x3

- Apply to requests matching:
  ^render/(.+)$

- Conditions (if any of them matches):

  REQUEST_FILENAME is not a file

  QUERY_STRING matches with:
  ^(?:.*?&|)debug(?:\=.*|&.*|$)$
   
- Redirect to:

  Internal redirect:
  app/parsers/slir/index.php?$1

- Next action:
  Stop matching
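
The 'any of them matches' logic of Rule 2 can be sketched the same way. The assumptions are again that the path is relative to the /x3/ base and that Python's re behaves like the server's engine here; rule2 and the sample inputs are made up for illustration.

```python
import re

RENDER_PATH = re.compile(r"^render/(.+)$")
DEBUG_QS = re.compile(r"^(?:.*?&|)debug(?:\=.*|&.*|$)$")


def rule2(relative_path, query_string, is_file):
    """Return the internal redirect target, or None if Rule 2 does not apply."""
    m = RENDER_PATH.match(relative_path)
    if m is None:
        return None
    wants_debug = DEBUG_QS.match(query_string) is not None
    # Either condition is enough: the file is missing, or debug was requested.
    if is_file and not wants_debug:
        return None
    return "app/parsers/slir/index.php?" + m.group(1)


print(rule2("render/w640/a.jpg", "", is_file=False))
print(rule2("render/w640/a.jpg", "", is_file=True))
print(rule2("render/w640/a.jpg", "debug", is_file=True))
```

The first and third calls return app/parsers/slir/index.php?w640/a.jpg, while the second returns None because the rendered file already exists and no debug argument was given.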

Rule 3: ---------------

If a request points to a file or to a directory that doesn't exist, redirect it to the X3 gallery script's index.php file.

The second parenthesis group accounts for folders with the trailing slash '/'.

-----------------------
Code:
- Relative to base:
  /x3

- Apply to requests matching:
  ^(.*?)(/|)$

- Conditions (all of them must match):

  REQUEST_FILENAME is not a file
  REQUEST_FILENAME is not a directory
   
- Redirect to:

  Internal redirect:
  index.php?/$1$2

  Options:
   - Append query string

- Next action:
  Stop matching
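
Rule 3 can be tried out in the same spirit. In this sketch the two REQUEST_FILENAME conditions are collapsed into one boolean, the path is assumed to be relative to the /x3/ base, and the 'Append query string' option is modeled by joining with '&', which is a guess at how the server combines the two.

```python
import re

RULE3 = re.compile(r"^(.*?)(/|)$")


def rule3(relative_path, query_string, exists):
    """Return the internal redirect target, or None if Rule 3 does not apply.

    exists: True when the path maps to a real file or directory.
    """
    if exists:
        return None
    m = RULE3.match(relative_path)
    if m is None:
        return None
    # Mirrors "index.php?/$1$2"; group 2 keeps an optional trailing slash.
    target = "index.php?/{}{}".format(m.group(1), m.group(2))
    if query_string:
        target += "&" + query_string  # assumed form of "Append query string"
    return target


print(rule3("galleries/birds/", "", exists=False))       # index.php?/galleries/birds/
print(rule3("galleries/birds", "page=2", exists=False))  # index.php?/galleries/birds&page=2
print(rule3("css/site.css", "", exists=True))            # None
```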

---------------

Note that in all these rewrite rules, none of them have the 'Apply to subrequests' option enabled.

Subrequests is for when URLRewrite does a redirect and its further internal requests should also go through itself again.

For the redirect to www.aurorawings.me you can also modify it to be this way:

---------------
Code:
  Relative to base:
    /

  Apply to requests matching:
    .*

  Condition (all must match):

  1.
    SCRIPT_URI matches with:
    ^http(s|)\://[^/]+?(/|$)(.+|$)$

  2.
    HTTP_HOST matches with:
    ^(?!^\.)(?!.+?\.[^\.]+?\.(?<!\.$))(.+?)(?:\.+?$|)$

  Action:

    External redirect to:
    http$1://www.$4$2$3

  * Explanations:
  * $1: 's' or empty for https,
  * $2: '/' or empty if not found,
  * $3: the rest of the URL after a '/' if found, or empty if not,

  * $4: the hostname captured in HTTP_HOST.

  Next action:
    Stop matching

  Options:
    - Append query string

Of course, all this RegEx recursion of 'must be something that must not be X thing and must be Y' surely looks like hazardous wizardry.

The only unsure detail is whether SCRIPT_URI 100% really contains the full URL with the http(s) part.

The Abyss Web Server manual says that it does.

I didn't have any time to verify it myself.
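
Whether SCRIPT_URI really holds the full URL has to be checked on the server, but what the first condition's pattern would capture from such a URL can be tried offline. This sketch uses Python's re module and example.com as a placeholder host; split_uri is an illustrative helper name.

```python
import re

SCRIPT_URI = re.compile(r"^http(s|)\://[^/]+?(/|$)(.+|$)$")


def split_uri(uri):
    """Return the three captured groups, or None if the pattern does not match."""
    m = SCRIPT_URI.match(uri)
    return None if m is None else m.groups()


print(split_uri("https://example.com/x3/page"))  # ('s', '/', 'x3/page')
print(split_uri("http://example.com"))           # ('', '', '')
print(split_uri("/x3/page"))                     # None
```

The last call shows why the question matters: if the variable only contained a path such as /x3/page, the condition would never match and the redirect would not fire.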

However, considering that your photo gallery script probably needs to handle edge cases for the hostname redirect to www versions, I took the time to come up with the monstrous RegEx above.

A RegEx monstrosity, not because of actual complexity, but rather monstrous for our brains since it's basically some 'it must not be something that must not be... that must not end with...':

^(?!^\.)(?!.+?\.[^\.]+?\.(?<!\.$))(.+?)(?:\.+?$|)$

If the hostname that ultimately follows correctly matches the requirements below:

- MUST NOT begin with a dot,

- and MUST NOT begin with one or more characters followed by a dot (that if found will stop searching it further at the first occurrence of it),

- followed by one or more non-dot characters followed by a dot, (that if found will stop searching it further at the first occurrence of it),

- and if we found a valid 'sub.domain.' beginning, then it MUST NOT actually END with a dot.

- if the domain IS like 'domain.com.' then we DO make the RegEx valid and we will capture the hostname - but you will see later - we will NOT capture the end dots, no matter the number of dots there could be at the end.

* because, we want to capture hostnames we deem invalid (without subdomain), not the correct ones with a subdomain.

* good domain formats like 'sub.domain.com' will not match the RegEx at all and the rule will not activate.

- once we confirmed that we DO NOT have a valid subdomain there, therefore we have only a 'domain.com' hostname, then we will capture all the text of the hostname and put it in a RegEx variable.

- HOWEVER, since some people sometimes write like 'domain.com.' (which is actually also valid...), then any number of trailing dots at the very END of that hostname will NOT be captured in the RegEx variable.
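
The hostname pattern is easier to trust after trying it on a few names. The sketch below runs it through Python's re module, assuming the lookahead and lookbehind behave as they do in the server's engine; the hostnames are placeholders and bare_host is an illustrative helper.

```python
import re

BARE_HOST = re.compile(r"^(?!^\.)(?!.+?\.[^\.]+?\.(?<!\.$))(.+?)(?:\.+?$|)$")


def bare_host(host):
    """Return the captured hostname, or None when the rule would not activate."""
    m = BARE_HOST.match(host)
    return None if m is None else m.group(1)


for host in ["domain.com", "domain.com.", "www.domain.com",
             "sub.domain.com.", ".domain.com"]:
    print(host, "->", bare_host(host))
```

With these inputs, only domain.com and domain.com. are captured (both as domain.com), and the three other names give None, so hosts that already carry a subdomain are left alone. One thing to keep in mind is that a name such as domain.co.uk also counts as having a subdomain under this pattern.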

For a start, I think that you should try running a separate hostname as you did, but with only the X3 gallery script - and only the x3 gallery URL rewrites with perhaps if you wish the www-redirect rewrite (for the tests you could temporarily directly connect to the www version).

If you keep the www-rewrite during the tests, then put it at the top of the urlrewrite rules list.

I think that the rewrite rule tweaks, which aren't exact conversions of the htaccess ones (I tried some RegEx bugfixes for some that perhaps were causing problems), should work at least when using only the x3 gallery on your test server.

I hope that it will work, and I do think it should actually work.
JMMotyer
-


Joined: 06 Jul 2005
Posts: 78
Location: Burlington (Toronto-ish), Ontario, Canada

Posted: Mon Aug 08, 2022 2:57 am

Note that I've changed the domains that I've been using for testing:

    PanAurora-AI.com (combined website, with its host's document root pointing directly to the CMS, and with an alias pointing to the X3 folder)
    PanAurora-AI.com/x3 (URL to the X3 installation on the combined website, with its host's document root pointing directly to the CMS, and with an alias pointing to the X3 folder... and with your latest X3 rewrite rules)
    PanAurora-AI.net (stand-alone X3 photo-gallery, with its host's document root pointing directly to the X3 folder... this is what the initial screen should look like)

Regarding your 4th suggestion in your previous post:
Quote:
Code:
Relative to base:
/

Apply to requests matching:
.*

Condition (all must match):

1.
SCRIPT_URI matches with:
^http(s|)\://[^/]+?(/|$)(.+|$)$

2.
HTTP_HOST matches with:
^(?!^\.)(?!.+?\.[^\.]+?\.(?<!\.$))(.+?)(?:\.+?$|)$

Action:

External redirect to:
http$1://www.$4$2$3

* Explanations:
* $1: 's' or empty for https,
* $2: '/' or empty if not found,
* $3: the rest of the URL after a '/' if found, or empty if not,

* $4: the hostname captured in HTTP_HOST.

Next action:
Stop matching

Options:
- Append query string

In the below image, for the rewrite rule to direct all non-WWW traffic to WWW, the screenshot on the left is what I have been using for many months & which has worked perfectly (and which was suggested by yourself back this past March, thank you again)... and the screenshot on the right is using your above suggestion for your #4 rewrite rule:

When I change the rewrite rule to direct all non-WWW to WWW using your settings on the right in the above screenshot, the URL changes to (in the below image):

...which you alluded to here:
Quote:
For the aurorawings hostname, does the rewrite rule only activate if the HTTP_HOST is not already www.aurorawings.me?

Otherwise the rules below will never activate and since it's an external redirect you will enter a redirection loop.

The same loop happened regardless if I had that WWW rewrite rule in last place or first place, so I've reverted back to my original WWW rewrite rule that you provided me with this past March.

My current settings:

.....

Horizon, please don't waste any more of your time with this rewriting request, and please consider this post closed. I'm so frustrated with this Forum, due to the forums here freezing for a couple of hours at a time over the past few months. Tonight, after many hours of testing & taking screenshots for this reply, these forums hung yet again, and I lost everything that I had written over the past few hours about my testing with screenshots.

The X3 photo-gallery script that I'm using, was written by a developer who perhaps is not talented enough to write a proper photo-gallery script for anything other than Apache, but until I can find something equivalent that will work with ALL websites & not just Apache, I'm stuck with it. So going forward, I will just stick with running my few X3 photo-galleries on their own respective domains/hosts.

I sincerely do appreciate your help to date & your suggestions, but I do not want you to waste any more of your time on this.

Regards,
John
admin
Site Admin


Joined: 03 Mar 2002
Posts: 1347

Posted: Mon Aug 08, 2022 9:13 pm

JMMotyer,

Have you considered contacting our technical support by email? We're here to offer another "channel" of help. Sharing your abyss.conf file with us can save you a lot of screenshot taking time.
_________________
Forum Administrator
Aprelium - https://aprelium.com