Posts

Showing posts with the label unicode

Finding different type of single quotes in javascript

My requirement: there are three strings containing different types of single quote (', ‘ and ’), and when I search for a plain single quote (') I should get the index of the quote in all three cases. For example:

var str1 = "this is someone's question";
var str2 = "this is someone‘s question";
var str3 = "this is someone’s question";
str.indexOf("'");

This statement should find the single quote (', ‘ or ’) in all three variables, just as Google Chrome's find-in-page search does. Thanks in advance.

str.indexOf("'"); they're different characters. You will need 3 statements or a regex. – Philipp Sander Jun 29 at 11:03

Google likely either normalizes this, or just searches for t...
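The regex route mentioned in the comment above can be sketched as follows. This is a minimal illustration, not the accepted answer from the thread: it assumes only the three characters from the question matter (U+0027, U+2018, U+2019), and the names SINGLE_QUOTES and indexOfSingleQuote are hypothetical.

```javascript
// Assumption: only these three quote characters need to match.
// U+0027 is the ASCII apostrophe; U+2018 and U+2019 are the curly single quotes.
const SINGLE_QUOTES = /['\u2018\u2019]/;

// Hypothetical helper: index of the first single-quote variant, or -1 if none.
function indexOfSingleQuote(str) {
  return str.search(SINGLE_QUOTES);
}

const str1 = "this is someone's question";
const str2 = "this is someone\u2018s question";
const str3 = "this is someone\u2019s question";

// All three strings report the same position for the quote.
console.log(indexOfSingleQuote(str1));
console.log(indexOfSingleQuote(str2));
console.log(indexOfSingleQuote(str3));
```

String.prototype.search returns the index of the first match, so it behaves like indexOf with a character class in place of a single character. Other apostrophe-like characters (for example U+02BC) would need to be added to the class if they can occur in the input.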

Python and BeautifulSoup encoding issues

I'm writing a crawler in Python using BeautifulSoup, and everything was going swimmingly until I ran into this site: http://www.elnorte.ec/

I'm getting the contents with the requests library:

r = requests.get('http://www.elnorte.ec/')
content = r.content

If I print the content variable at that point, all the Spanish special characters appear correctly. However, once I feed the content variable to BeautifulSoup, the output is garbled:

soup = BeautifulSoup(content)
print(soup)
...
<a class="blogCalendarToday" href="/component/blog_calendar/?year=2011&amp;month=08&amp;day=27&amp;modid=203" title="1009 artículos en este día">
...

It is apparently garbling all the Spanish special characters (accents and so on). I've tried content.decode('utf-8') and content.decode('latin-1'), and also tried messing around with the fromEncoding parameter...

Keyman developer 10 won't match rules in Odia script

I'm making a custom keyboard for Oriya/Odia script with Keyman Developer 10, but it won't do contextual substitutions when all the input is in Odia script. For example:

+ [K_K] > U+0B15
+ [K_T] > U+0B24
U+0B15 + U+0B24 > U+0B15 U+0B4D U+0B24
"a" + "b" > U+0B15 U+0B4D U+0B24
U+0B15 + [K_C] > U+0B15 U+0B4D U+0B24

When I test this, I get the desired output when I type 'ab' or 'kc' but not with 'kt'. Any help explaining why line 3 won't work but line 4 does would be appreciated. I do sometimes get this error when Targets is set to 'any' rather than 'windows':

warning 209A: The rule will never be matched because its key code is never fired.

1 Answer

The reason this isn't working is that you are trying to match on a Unicode value instead of a keystroke on line 3: U+0...