Prev: Need Translation library
Next: Replace and inserting strings within .txt files with the useof regex
From: Νίκος on 9 Aug 2010 10:17
On 9 Aug, 13:47, Peter Otten <__pete...(a)web.de> wrote:
> Νίκος wrote:
> > On 9 Aug, 13:06, Peter Otten <__pete...(a)web.de> wrote:
> >
> >> > So since it's utf-8, what's the problem with opening it?
> >
> >> Python says it's not, and I tend to believe it.
> >
> > You are right!
> >
> > I tried the exact same open call in the IDLE environment and got
> > the encoding of the file from there:
> >
> > >>> open("d:\\test\\index.php" ,'r')
> > <_io.TextIOWrapper name='d:\\test\\index.php' encoding='cp1253'>
> >
> > That's why the error in my previous post said
> > File "C:\Python32\lib\encodings\cp1253.py", line 23, in decode
> > It tried to use the cp1253 encoding.
> >
> > But now, since Python as we see can understand the nature of the
> > encoding, what is causing it not to open the file?
>
> It doesn't. You have to tell.

Why doesn't it? The IDLE response indicates that it knows the file encoding is "cp1253", which means it can identify it.

> *If* the file uses cp1253 you can open it with
>
> open(..., encoding="cp1253")
>
> Note that if the file is not in cp1253 python will still happily open it as
> long as it doesn't contain the following bytes:
>
> >>> for i in range(256):
> ...   try: chr(i).decode("cp1253") and None
> ...   except: print i
> ...
> 129
> 136
> 138
> 140
> 141
> 142
> 143
> 144
> 152
> 154
> 156
> 157
> 158
> 159
> 170
> 210
> 255
>
> Peter

I'm afraid it does, because when I tried:

f = open(src_f, 'r', encoding="cp1253" )

I got the same error again. What are those characters? Don't they belong to the same weird 'cp1253' encoding too? Why can't the compiler open them?
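
For readers on Python 3, the byte check quoted above can be sketched roughly as follows. The quoted snippet is Python 2 (`chr(i).decode(...)`, `print i`); here a one-byte `bytes` object is decoded instead. The helper name `undecodable_bytes` is invented for this illustration and is not part of any library.

```python
# Python 3 sketch: list the byte values that a single-byte codec
# (cp1253 by default) has no character assigned to.
def undecodable_bytes(encoding="cp1253"):
    bad = []
    for i in range(256):
        try:
            # bytes([i]) is the one-byte string with value i
            bytes([i]).decode(encoding)
        except UnicodeDecodeError:
            bad.append(i)
    return bad


if __name__ == "__main__":
    for value in undecodable_bytes():
        print(value)
```

A file that contains any of the listed byte values cannot be decoded as cp1253 in strict mode, which is consistent with the error persisting after `encoding="cp1253"` was passed explicitly: the file is probably in some other encoding.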
From: Νίκος on 9 Aug 2010 11:58
Please tell me that no matter what weird chars a file has inside, I can still open those files and make the necessary replacements.
From: Peter Otten on 9 Aug 2010 12:21
Νίκος wrote:
> Please tell me that no matter what weird chars a file has inside, I can still
> open those files and make the necessary replacements.

Go back to 2.6 for the moment and defer learning about unicode until you're done with the conversion job.
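
If staying on Python 3 is preferred, one way to open a file regardless of the bytes inside it is binary mode, where no decoding takes place at all. This is roughly how Python 2's `str` behaves, which is presumably why going back to 2.x sidesteps the error. The sketch below is only an illustration under that assumption; `replace_in_file` is a hypothetical helper, and the search and replacement values must be `bytes`, not `str`.

```python
# Python 3 sketch: replace a byte sequence in a file without decoding it.
# replace_in_file is a hypothetical helper name, not a library function.
def replace_in_file(path, old, new):
    # "rb" reads raw bytes, so no codec can raise UnicodeDecodeError
    with open(path, "rb") as f:
        data = f.read()
    count = data.count(old)
    # "wb" writes the bytes back unchanged apart from the replacement
    with open(path, "wb") as f:
        f.write(data.replace(old, new))
    return count
```

The trade-off is that non-ASCII search text has to be encoded by hand in whatever encoding the file really uses.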
From: Νίκος on 9 Aug 2010 13:40
On 9 Aug, 19:21, Peter Otten <__pete...(a)web.de> wrote:
> Νίκος wrote:
> > Please tell me that no matter what weird chars a file has inside, I can still
> > open those files and make the necessary replacements.
>
> Go back to 2.6 for the moment and defer learning about unicode until you're
> done with the conversion job.

You are correct again! 3.2 caused the problem. I switched to 2.7 and now I don't have that problem anymore. The file opens okay!

It ALMOST converts correctly!

# replace tags
print ( 'replacing php tags and contents within' )
src_data = re.sub( '<\?(.*?)\?>', '', src_data )

It only converts the first instance of the PHP tags and not the rest. But why?
From: Νίκος on 9 Aug 2010 15:27
On 8 Aug, 20:29, John S <jstrick...(a)gmail.com> wrote:
> When replacing text in an HTML document with re.sub, you want to use
> the re.S (singleline) option; otherwise your pattern won't match when
> the opening tag is on one line and the closing is on another.

That's exactly the problem I am facing now with this statement:

src_data = re.sub( '<\?(.*?)\?>', '', src_data )

Do you mean I have to change it like this?

src_data = re.S ( '<\?(.*?)\?>', '', src_data )
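
As a hedged illustration of the re.S advice quoted above: `re.S` (an alias of `re.DOTALL`) is a flag, not a function, so it is passed to `re.sub` or `re.compile` rather than called. The sketch assumes Python 2.7 or 3.1 and later, where `re.sub` accepts a `flags` keyword. The names `PHP_TAG` and `strip_php` and the sample string are made up for the example.

```python
import re

# With re.S, "." also matches newlines, so a tag may span several lines.
# The non-greedy ".*?" stops at the first "?>" after each "<?".
PHP_TAG = re.compile(r"<\?.*?\?>", re.S)


def strip_php(src_data):
    # Remove every <? ... ?> block, single-line or multi-line.
    return PHP_TAG.sub("", src_data)


sample = "<html>\n<?php\n echo 'a';\n?>\n<p>text</p>\n<?php echo 'b'; ?>\n</html>"

if __name__ == "__main__":
    print(strip_php(sample))
```

The one-call form is `re.sub(r'<\?.*?\?>', '', src_data, flags=re.S)`. Writing the flag as a keyword matters, because the fourth positional argument of `re.sub` is `count`, not `flags`.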