I am trying to make the router accept expressions made of letters, digits and underscores, i.e. a-zA-Z0-9А-Яа-я_. Here is how I do it:

    rootApp.all(/\+(\w+|\p{L}+)/, function (req, res, next) {
        res.cookie("referer", req.params[0]);
        res.send(req.params);
    });

But in the end I get:

    GET /+А - 404 NOT FOUND
    GET /+ref - {"0":"ref"}
    GET /+ref_123ру_ - {"0":"ref_123"}

How do I make the pattern accept Cyrillic letters as well?

UPDATE: It turns out that the path arrives percent-encoded. Even after decoding it, the match still does not work.

    rootApp.all(/\+(\S)/, function (req, res, next) {
        var decodedPath = decodeURIComponent(req.path),
            regexp = /\+(\w+|\p{IsCyrillic}+|\p{L}+)/;
        console.log(decodedPath);
        if (regexp.test(decodedPath)) {
            var parsed = regexp.exec(decodedPath);
            console.log("Parsed String: ", parsed);
            res.cookie("referer", parsed[1]);
            res.send(parsed);
        } else next();
    });

Console

    /+ref_123ру_
    Parsed String: [ '+ref_123', 'ref_123', index: 1, input: '/+ref_123ру_' ]
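
The results above are consistent with two facts: the path reaches the handler percent-encoded, and `\w` in JavaScript covers only `[A-Za-z0-9_]`. A minimal sketch that reproduces both outside the router, assuming the client encodes the path as UTF-8 percent sequences (the variable names are illustrative only):

```javascript
// What the handler would see for GET /+ref_123ру_ (assumed: UTF-8 percent-encoding).
const encoded = "/+" + encodeURIComponent("ref_123ру_");
console.log(encoded); // /+ref_123%D1%80%D1%83_

// On the encoded path, "/+А" becomes "/+%D0%90": "%" is not a word character,
// so the first pattern finds nothing to capture.
const encodedA = "/+" + encodeURIComponent("А");
console.log(/\+(\w+)/.exec(encodedA)); // null

// After decoding, the Cyrillic letters are back...
const decoded = decodeURIComponent(encoded);
console.log(decoded); // /+ref_123ру_

// ...but \w is ASCII-only, so the match stops at the first Cyrillic letter.
const match = /\+(\w+)/.exec(decoded);
console.log(match[1]); // ref_123
```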
  • Then add Cyrillic to the pattern. Most likely you also need to allow %, since URLs encode Cyrillic as % sequences. - nick_n_a
  • The Cyrillic is already there: in the first request it is the Russian "А", in the third it is "ру". - blits
  • Different programs handle Cyrillic in different ways. In a regex, Cyrillic is written as [А-Яа-я], but in a URL it may look like http://%D1%8F%D0%BD%D0%B4%D0%B5%D0%BA%D1%81.%D1%80%D1%83 (Yandex.Ru in plain percent-encoding), plus Unicode escapes written as % with a 4-digit hex code. Try adding %. - nick_n_a
  • @nick_n_a I updated the question, take a look. - blits
  • \p{IsCyrillic} is not supported by RegExp, and neither are any other Unicode classes starting with \p. - Wiktor Stribiżew
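
For context on the last comment: when a JavaScript pattern has no `u` flag, `\p` is treated as an escaped letter "p", so `\p{L}` matches the literal text `p{L}` rather than a Unicode letter. Engines that implement ES2018 Unicode property escapes do accept `\p{L}` and `\p{Script=Cyrillic}` when the `u` flag is set. The sketch below assumes such an engine (for example Node.js 10 or later); it is an illustration, not something used in this thread.

```javascript
// Without the u flag, \p{L} is not a Unicode class: it matches the text "p{L}".
const legacy = /\p{L}+/;
console.log(legacy.test("ру"));   // false
console.log(legacy.test("p{L}")); // true

// With the u flag (ES2018+), \p{L} is any letter and \p{N} any number.
const letters = /\+([\p{L}\p{N}_]+)/u;
console.log(letters.exec("/+ref_123ру_")[1]); // ref_123ру_

// Restricting to one script uses the Script property.
const cyrillic = /^\p{Script=Cyrillic}+$/u;
console.log(cyrillic.test("ру"));  // true
console.log(cyrillic.test("ref")); // false
```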

1 answer

In the end, I did this:

    rootApp.all(/\+(\S+)/, function (req, res, next) {
        var decodedPath = decodeURIComponent(req.path),
            regexp = /\+[a-zA-Z0-9А-Яа-я_]+/;
        var executed = regexp.exec(decodedPath);
        if (executed != null) {
            var parsed = executed[0].slice(1);
            res.cookie("referer", parsed);
            res.send(parsed);
        } else next();
    });
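
A quick check of the character class used above, run outside the router. One caveat: the ranges А-Я and а-я cover U+0410 to U+044F, which leaves out Ё (U+0401) and ё (U+0451). The `withYo` pattern below is a hypothetical variant that adds them; it is not part of the original solution.

```javascript
const original = /\+[a-zA-Z0-9А-Яа-я_]+/;
const withYo = /\+[a-zA-Z0-9А-Яа-яЁё_]+/; // hypothetical variant, adds Ё and ё

// The decoded path from the question now matches in full.
const path = decodeURIComponent("/+ref_123%D1%80%D1%83_");
const referer = original.exec(path)[0].slice(1); // drop the leading "+"
console.log(referer); // ref_123ру_

// "ё" lies outside а-я, so a value starting with it is not matched at all.
console.log(original.exec("/+ёлка")); // null
console.log(withYo.exec("/+ёлка")[0].slice(1)); // ёлка
```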

Thanks to all!

  • decodeURIComponent turns sequences like %D1 back into Russian characters. In principle, /\+[a-zA-Z0-9%]+/ should also work on the undecoded path. - nick_n_a
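
The suggestion in this comment can be sketched as follows: match the still-encoded path, allowing `%` in the character class, and decode only the captured part. The underscore and the capture group are my additions to the commenter's pattern. Note that this class accepts any percent-encoded character, not only Cyrillic, so it is looser than the pattern in the answer.

```javascript
// Raw path as received (assumed: UTF-8 percent-encoding of "/+ref_123ру_").
const raw = "/+ref_123%D1%80%D1%83_";

// Match on the encoded form; "%" and the hex digits are all in the class.
const m = /\+([a-zA-Z0-9_%]+)/.exec(raw);
console.log(m[1]); // ref_123%D1%80%D1%83_

// Decode only the captured value.
console.log(decodeURIComponent(m[1])); // ref_123ру_
```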