Character encoding in Drafts and Templates won't "stick"

I've been using T'bird for many years and am now using version 30.0 on Windows 8. Over the years I have wrestled extensively with character encodings, because I write a lot of messages in mixed English and Cyrillic. I have things set now so that my default encoding is Cyrillic (Windows-1251) and this works pretty well, EXCEPT!! that when I save a message in my Drafts folder (or any of its subfolders), or my Templates folder, two things happen:
1. the Cyrillic goes to garbage (I can sometimes recover this by physically moving the message to the Inbox)
2. I get a series of Â and/or Ã characters, with and without spaces between.
This is a major pain in the butt! If I catch it the first time around, and move the offending message to the Inbox and "Edit as new", I can usually rescue the Cyrillic - but if I (for example) make some changes elsewhere in the message in English portions and save it again (without noticing the mess-up of the Cyrillic), it's all lost and I've found no way to recover it.
I've been reading this forum and I suspect that my problem lies in the "folder properties", and/or maybe that I have not set up a User-defined set of display properties. However, I'm afraid to just mess around with things (as I've done so much in the past) for fear of messing up messages that I've been keeping for a long time in the OTHER folders.
Help please? I use Windows 7.
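One plausible mechanism for runs of Â and Ã that grow with every save is repeated mis-decoding: text stored as UTF-8 is read back as a single-byte Western encoding and then saved again. The sketch below only illustrates that mechanism; it is an assumption, not a confirmed description of what Thunderbird does with Drafts or Templates. The class and method names are made up for the illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class DoubleEncodingDemo {
    private static final Charset CP1252 = Charset.forName("windows-1252");

    // Simulates one faulty save: the text is written out as UTF-8 bytes,
    // then those bytes are read back as if they were windows-1252.
    static String misreadOnce(String text) {
        return new String(text.getBytes(StandardCharsets.UTF_8), CP1252);
    }

    public static void main(String[] args) {
        // 'a', NO-BREAK SPACE, 'b' (escapes keep this source file pure ASCII)
        String text = "a\u00A0b";
        for (int round = 0; round < 3; round++) {
            System.out.println("round " + round + ": " + text.length() + " chars: " + text);
            text = misreadOnce(text);
        }
    }
}
```

Each round turns a no-break space into "Â" plus a no-break space, and every earlier "Â" into "Ã" plus another character, so the debris gets longer with each save. After one round the damage is still reversible; once the text has been edited and saved again it becomes much harder to undo.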

Didn't understand from what starting point you want me to "select the mail account name", but closed and re-opened T'bird.
Then I started to run your tests.
To my HUGE AMAZEMENT!!! - everything seems to be working today!!
This leaves me with a couple of old files that seem to have gotten trashed a long time ago and won't simply "recover" - but I did keep back-up copies of them elsewhere, and can now go about rebuilding them if necessary.
Thank you for your help - whatever it is that I did, seems to have worked. After how many years?! If I have trouble in the future, I'll be back - but for right now, what a RELIEF!
Best, Martha

Similar Messages

Character Encoding for JSPs and HTML forms

After having read loads of postings on character encoding problems I'm still puzzled about the following problem:
I have an instance (A) of WL 8.1 SP3 on a WinXP machine and another instance (B) of WL 8.1 without any SP on a Win2K machine. The underlying Windows locale is english(US) in both cases.
The same application deployed as a war file to these instances does not behave in the same way when it comes to displaying non-Latin1-characters like the Euro symbol: Whereas (A) shows and accepts these characters as request-parameters, (B) does not.
Since the war file is the same (weblogic.xml, jsps and everything), the reason for this must either be the service-pack-level or some other configuration setting I overlooked.
Any hints are appreciated!

Try this:
Preferences -> Content -> Fonts & Colors -> Advanced
At the bottom, choose your Encoding.

Seeing � etc despite having View--Character encoding as unicode and auto-detect universal

On viewing some web pages I see characters such as � (for example), even though View > Character Encoding is set to Unicode (UTF-8) or Western (ISO-8859-1) and the encoding under Tools > Options > Content > Fonts > Advanced is set to either of those.

example of page:
http://scienceofdoom.com/2010/09/17/on-missing-the-point-by-chilingar-et-al-2008/
- a little over halfway down, in the section headed "Anthropogenic Impact on the Earth’s Climate – Tiny", from the paragraph "And continue: " onward, there are these non-characters in equation (12) and subsequently.
Another page : http://www.zimbabwesituation.com/sep26_2010.html in the topic " Red warning lights" .
Most web-pages I read are without problem.
I contacted the writer of the first page and s/he had no idea why it happens.

Character encoding: Ansi, ascii, and mac, oh my!

I'm writing a program which has to search & replace data in user-supplied Rich Text documents (.rtf). Ideally, I would like to read the whole thing into a StringBuffer, so that I can use all of the functionality built into String and StringBuffer, and so that I can easily compare with constant Strings and chars.
The trouble that I have is with character encoding. According to the rtf spec, RTFs can be encoded in four different character encodings: "ansi", "mac", IBM PC code page 437, and IBM PC code page 850, none of which are supported by Java (see http://impulzus.sch.bme.hu/tom/szamitastechnika/file/rtfspec/rtfspec_6.htm#rtfspec_8 for the RTF spec and http://java.sun.com/j2se/1.3/docs/api/java/lang/package-summary.html#charenc for the character encodings supported by Java).
I believe, from a bit of googling, that they are all 8 bits/character, so I could read everything into a byte array and manipulate that directly. However, that would be rather nasty. I would have to be careful with the changes that I make to the document, so that I do not insert values that do not encode correctly in the document's character encoding. Overall, a large hassle.
So my question is - has anyone done something like this before? Any libraries that will make my job easier? Or am I missing something built into Java that will allow me to easily decode and reencode these documents?

DrClap, thanks for the response.
If I could map from the encodings listed above (which are given in the RTF document) to a Java encoding name from the page that you listed, that would solve all my problems. However, there are a couple of problems:
a) According to this page - http://orwell.ru/info/diffs.htm - ANSI is a superset of ISO-8859-1. That page isn't exactly authoritative, but I can't afford to lose data.
b) I'm not sure what to do about the other character encodings. "mac" may correspond to "MacRoman" but that page lists a dozen or so other Macintosh encodings. Gotta love crystal-clear MS documentation.
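A minimal sketch of the mapping idea discussed above, assuming the four RTF character-set keywords (`ansi`, `mac`, `pc`, `pca`) correspond to the Java charsets windows-1252, x-MacRoman, IBM437 and IBM850. That mapping is an assumption for illustration: real RTF files may also carry an `\ansicpg` code page that should take precedence, and `RtfCharsets` and its methods are hypothetical names.

```java
import java.nio.charset.Charset;
import java.util.Map;

final class RtfCharsets {
    // Assumed mapping from RTF header keywords to Java charset names.
    private static final Map<String, String> RTF_TO_JAVA = Map.of(
            "ansi", "windows-1252",
            "mac", "x-MacRoman",
            "pc", "IBM437",
            "pca", "IBM850");

    static Charset forRtfKeyword(String keyword) {
        String name = RTF_TO_JAVA.get(keyword);
        if (name == null) {
            throw new IllegalArgumentException("Unknown RTF charset keyword: " + keyword);
        }
        return Charset.forName(name);
    }

    // Decodes the document, replaces text, and re-encodes with the same charset.
    // Refuses replacements that the document's charset cannot represent.
    static byte[] replace(byte[] doc, String keyword, String target, String replacement) {
        Charset cs = forRtfKeyword(keyword);
        if (!cs.newEncoder().canEncode(replacement)) {
            throw new IllegalArgumentException("Replacement not representable in " + cs);
        }
        String text = new String(doc, cs);
        return text.replace(target, replacement).getBytes(cs);
    }
}
```

The round trip is lossless only when every byte value is defined in the charset. Java's windows-1252 leaves five byte values undefined, so a byte-preserving charset such as ISO-8859-1 may be the safer carrier if the document can contain arbitrary bytes.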

How do I set the character encoding for every page, permanently?

I use Thai (Windows-874) to view pages. When I am on a website that contains Thai and open a link in a new tab, the encoding changes to Western (Windows-1252) and the Thai cannot be displayed. I have to set the character encoding to Thai (Windows-874) every time.

Try this:
Preferences -> Content -> Fonts & Colors -> Advanced
At the bottom, choose your Encoding.

Character encoding with CF and MySQL

Okay, I thought this should be rather straight forward but
apparently not. I have set up my site to use UTF-8— my cfm
pages, the MySQL table, even Dreamweaver. The problem is when I
input international characters via a form they get written correctly
to the MySQL table; however, when I retrieve them in a query and
display them on the page I get them displayed incorrectly.
On my input.cfm page I'll enter the string
"Téstïñg" in the textbox and submit it. If I look at
the record via the MySQL Browser it appears as it should. However
when I display it on my output.cfm page it shows the record as
"T�st��g" and will do so until I change the
meta tag to use charset=ISO-8859-1. Am I missing something or is
this how it is supposed to work?
My input.cfm page is set up with both the
<cfprocessingdirective suppresswhitespace="YES"
pageencoding="UTF-8">
<meta http-equiv="Content-Type" content="text/html;
charset=UTF-8">
tags and a regular input formfield that writes to the MySQL
database.
The MySQL table is configured to use the utf8 char set and
utf8_unicode_ci collation.
And just to be safe I included
useUnicode=true&characterEncoding=utf8&characterSetResults=utf8
in the connection string on the CF Admin datasource setup page.
I'm running CF 6.1, MySQL 4.1, the latest version of Apache
Server on a Win2K3 box. I was running the 3.0.16 MySQL JDBC driver
but I upgraded it to the 5.0.6 this morning thinking that may fix
my issue.

I'm still unsure why this works but I've found a solution. I
switched all my pages over to character set ISO-8859-1 with the
exception of my database table and it works. I get all the normal
range characters along with the extended Unicode characters to write
to the database and output correctly. Unicode characters actually
write to the table as their HTML coded character.
If someone feels the need to enlighten me as to why this
works please feel free, I'm always willing to learn.
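A possible explanation for the "HTML coded character" observation: when a form is submitted from a page declared as ISO-8859-1, browsers replace characters that the encoding cannot represent with numeric character references such as `&#8364;`. The sketch below imitates that substitution. It is an illustration under that assumption, not a description of what ColdFusion or the JDBC driver does, and `FormEncodingSketch` is a made-up name.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

final class FormEncodingSketch {
    // Keeps characters that ISO-8859-1 can encode and replaces all others
    // with a decimal numeric character reference.
    static String asSubmittedFromLatin1Page(String input) {
        CharsetEncoder latin1 = StandardCharsets.ISO_8859_1.newEncoder();
        StringBuilder out = new StringBuilder();
        input.codePoints().forEach(cp -> {
            String ch = new String(Character.toChars(cp));
            if (latin1.canEncode(ch)) {
                out.append(ch);
            } else {
                out.append("&#").append(cp).append(';');
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        // "Téstïñg" followed by a euro sign, written with escapes
        System.out.println(asSubmittedFromLatin1Page("T\u00E9st\u00EF\u00F1g \u20AC"));
    }
}
```

Under this reading, the accented Latin letters survive because they fit in ISO-8859-1 end to end, while anything outside that range is stored as literal `&#NNNN;` text that the browser later renders as the intended character.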

Why, after all these years, can't Thunderbird auto-detect character encoding

Judging by all the existing messages and complaints about this, not to mention erroneous posts that say the problem is solved when it isn't, I have to conclude Mozilla either doesn't believe this is a problem or doesn't care to fix it. The bottom line is that there is no way to tell Thunderbird to automatically display emails in the character encoding they were written in. I could understand cases where the headers are not properly filled in, but I see tons of emails in which the encoding is plainly there in the headers within the message source. You can force it, but if you do so via the menu VIEW->Character Encoding->UTF8 (for example) it won't "stick" if you view another message. But who would want it to "stick" permanently anyway? What the average user really wants is to be able to toggle VIEW->Character Encoding->Auto Detect from its default "off" to simply "on", and not have to bother with it anymore.
This is a problem that seems to have gone on forever, and it NEVER happens with other email clients. If there is some backdoor way to actually make autodetect work, I'd appreciate knowing about it. But more important, I think ALL users would appreciate it if it were not some secret "backdoor" setting, but a simple global menu choice for all accounts. Can Mozilla please fix this problem once and for all?

You said...
''Thunderbird is supposed to be using the encoding in the mail.''
I figured it "should"; I'm just reporting that it doesn't.
You said...
''Setting auto detect to on disables that.''
Please explain. I've looked at every setting I can find and there is no way to set auto detect to "ON". I DID try setting it to "universal" in an attempt to solve the problem, but I have since restored it to "off", because the universal setting doesn't help.
You said...
''"Based on your earlier response I assume you need to press the F10 key to see the tools menu you were referred to." ''
No... I never said that anywhere. I DID refer to Menu->View->Character Encoding, and I did refer to right clicking on individual folders, to get to the properties dialog, and the general information tab. But F10 doesn't do anything.
You said...
''I have examined dozens of mails in my inbox and each honours the character encoding set in the HTML''
Well, mine NEVER did. A short example from an email I got today is pretty much typical of all mail I get from GMAIL...
--089e013a0572a067a404fc73ceda
Content-Type: text/plain; charset=UTF-8
Ok, very good. Thank you. Phoenix sent you a friend request on Facebook by
the way. Talk to you soon.
--089e013a0572a067a404fc73ceda
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
<p dir=3D"ltr">Ok, very good. Thank you. Phoenix sent you a friend request=
=C2=A0 on Facebook by the way.=C2=A0 Talk to you soon.</p>
--089e013a0572a067a404fc73ceda--
See those instances of "=C2=A0"? Each one displays as a strange character, a capital A with a curved line over it. If I manually set my default encoding to UTF-8, the weird characters go away. If I leave it as Western, there is nothing I can do to tell Thunderbird to "auto detect".
Anyway, I suppose at this point that no one responsible for the product coding is seriously looking at my issue, which is why it's never been solved. If anyone does intend to help track it down and solve it, I'll be happy to provide all the examples and screen shots they ask for. Otherwise.
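To make the "=C2=A0" symptom concrete, here is a small sketch that turns the quoted-printable escapes from the sample above into bytes and decodes them two ways. The decoder is deliberately minimal (it handles only `=XX` escapes, not soft line breaks) and `QuotedPrintableSketch` is a hypothetical name, not part of any mail library.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class QuotedPrintableSketch {
    // Converts "=XX" escapes to raw bytes; every other character is copied as-is.
    static byte[] decodeEscapes(String line) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '=' && i + 2 < line.length()) {
                out.write(Integer.parseInt(line.substring(i + 1, i + 3), 16));
                i += 2;
            } else {
                out.write(c);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] raw = decodeEscapes("way.=C2=A0 Talk to you soon.");
        // Declared charset (UTF-8): the two bytes become one no-break space.
        System.out.println(new String(raw, StandardCharsets.UTF_8));
        // Forced Western charset: the same two bytes become "Â" plus a no-break space.
        System.out.println(new String(raw, Charset.forName("windows-1252")));
    }
}
```

The bytes C2 A0 are the UTF-8 form of a single no-break space. Read as a Western single-byte encoding they become two characters, the first of which is the capital A with a circumflex described above.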

Locale and character encoding. What to do about these dreadful ÅÄÖ??

It's time for me to get it into my head how this works. Please, help me understand before I go nuts.
I'm from Sweden and we use a few of these weird characters like ÅÄÖ.
If I create a file called "övrigt.txt" in windows, then the file will turn up as "?vrigt.txt" on my Linux pc (At least in the console, sometimes it looks ok in other apps in X). The same is true if I create the file in Linux and copy it to Windows, it will look just as weird on the other side.
As I (probably) can't change the way Windows works, my question is what I have to do to have these two systems play nicely with each other?
This is the output from locale:
LANG=en_US.utf8
LC_CTYPE="en_US.utf8"
LC_NUMERIC="en_US.utf8"
LC_TIME="en_US.utf8"
LC_COLLATE=C
LC_MONETARY="en_US.utf8"
LC_MESSAGES="en_US.utf8"
LC_PAPER="en_US.utf8"
LC_NAME="en_US.utf8"
LC_ADDRESS="en_US.utf8"
LC_TELEPHONE="en_US.utf8"
LC_MEASUREMENT="en_US.utf8"
LC_IDENTIFICATION="en_US.utf8"
LC_ALL=
Is there anything here I should change? I have tried using ISO-8859-1 with no luck. Mind you that I want to have the system-wide language set to English. The only thing I want to achieve is that "Ö" on Windows should turn up as "Ö" in Linux as well, and vice versa.
Please save my hair from being torn off, I'm going bald here...

Hey, thanks for all the answers!
I share my files in a number of ways, but mainly through a web application called Ajaxplorer (very nice btw...). The thing is that as soon as a Windows user uploads anything with special characters in the file name, my programs (xbmc, console etc.) refuse to read them correctly. Other ways of sharing are file copying with USB sticks, ssh etc. It's really not the way of sharing that is the problem I think, but rather the special characters being used sometimes.
I could probably convert the filenames with suggested applications but then I'll set the windows users in trouble when they want to download them again, won't I?
I realize that it's cp1252 that is the bad guy in this drama. Is there no way to set/use cp1252 as a character encoding in Linux? It's probably a bad idea as utf8 seems like the future way to go, but the fact that these two OS's can't communicate too well in this area is pretty useless if you ask me.
To wrap this up I'll answer some questions...
@EVRAMP: I'm actually using pcmanfm, but that is only for me and I'm not dealing very often with vfat partitions to be honest.
@pkervien: Well, I think I mentioned my forms of sharing above. (kul med lite arch-svenskar!)
@quarkup: locale.gen is edited and both sv_SE and en_US have UTF-8 and ISO-8859 enabled and generated.
...and to clarify things even further. It doesn't matter if I get or provide a file via a USB stick, samba, ftp or by paper. All I want is for "Ö" to always be "Ö", everywhere.
I can't believe how hard this is to get around. Linus is Finnish for crying out loud. I thought he'd sorted this out the first thing he did. Maybe he doesn't deal with Windows or its users at all.
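A small sketch of what the "?vrigt.txt" symptom looks like at the byte level, assuming the file name reached the Linux side as windows-1252 bytes (for example through an upload tool that does not convert names). That assumption and the class name `FilenameBytes` are illustrative only.

```java
import java.nio.charset.Charset;

final class FilenameBytes {
    // Decodes raw file-name bytes with the named charset.
    static String decode(byte[] nameBytes, String charsetName) {
        return new String(nameBytes, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // "övrigt.txt" as a legacy Windows tool might write it: ö is the single byte 0xF6.
        byte[] legacy = "\u00F6vrigt.txt".getBytes(Charset.forName("windows-1252"));
        // A UTF-8 locale cannot interpret the lone 0xF6 byte and shows a placeholder.
        System.out.println(decode(legacy, "UTF-8"));
        // Decoding with the charset the name was written in recovers it.
        System.out.println(decode(legacy, "windows-1252"));
    }
}
```

Seen this way, the fix is to convert the name bytes once, at the boundary where they arrive, rather than switching the whole Linux system away from UTF-8.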

Why differing Character Encoding and how to fix it?

I have PRS-950 and PRS-350 readers, both since 2011.
In the last year, I've been getting books with Character Encoding that is not easy to read. In playing around with my browsers and View -> Encoding menus, I have figured out that it has something to do with the character encoding within the epub files.
I buy books from several ebook stores and I borrow from the library.
The problem may be the entire book, but it is usually restricted to a few chapters, with rare occasion where the encoding changes within a chapter. Usually it is for a whole chapter, not part, and it can be seen in chapters not consecutive to each other.
It occurs whether the book is downloaded directly to my 950 reader or if I load it to either reader from my computer(s), which are all Mac OS X of several versions from 10.4 to Mountain Lion. Since it happens when the book is downloaded directly, I figure the operating system of my computer is not relevant.
There are several publishers involved, though Baen (no DRM ebooks) has not so far been one of them.
If I look at the books with viewers on the computer, the encoding is the same. I've read them in Calibre, in the Sony Reader App, and in Adobe Digital Editions 2.0. It's always the same.
I believe the encoding is inherent to the files. I would like to fix this if I can to make the books I've purchased, many of them in paper and electronically, more enjoyable to read on my readers.
Example: Iâ€™ve is printed instead of I've.
â€™ for apostrophe
â€œ the opening of a quotation,
â€? for closing the quotation,
and I think â€” is for a hyphen.
When a sentence had â€œâ€™m for " 'm at the beginning of a speech (when the character was slurring his words) it took me a while to figure out how it was supposed to read.
â€œâ€™Sides, â€™tis only for a moon. That ainâ€™t long.â€?
was in one recent book.
Translation: " 'Sides, 'tis only for a moon. That ain't long."
See what I mean?
Any ideas?
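The pattern in these examples (â€™ for an apostrophe, â€œ for an opening quote) is what UTF-8 punctuation looks like after being read as windows-1252. A hedged sketch of reversing it follows; it assumes the damage happened exactly once and that no bytes were lost, and `MojibakeRepair` is a made-up helper, not a feature of Calibre or Sigil.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class MojibakeRepair {
    // Re-encodes the garbled text as windows-1252 and decodes the bytes as UTF-8.
    // Returns the input unchanged if either step fails, so clean text is left alone.
    static String repair(String garbled) {
        try {
            ByteBuffer bytes = Charset.forName("windows-1252").newEncoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .encode(CharBuffer.wrap(garbled));
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(bytes)
                    .toString();
        } catch (CharacterCodingException e) {
            return garbled;
        }
    }

    public static void main(String[] args) {
        // "Iâ€™ve" written with escapes so the source file stays ASCII
        System.out.println(repair("I\u00E2\u20AC\u2122ve"));
    }
}
```

The closing-quote case shown as â€? cannot be repaired this way, because its third byte has already been replaced by a question mark; such spots would need to be fixed by hand or from context.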

Hi
I wonder if it’s possible to download a free ebook with such an issue, in order to run some “tests”.
Perhaps it’s possible, on free ebooks (without DRM), to add fonts by using software like Sigil.

Reading Advanced Queuing with XMLType payload and JDBC Driver character encoding

Hi
I've got a problem retrieving the message from the queue with XMLType payload in Java.
It was working fine in the 10g database, but after the switch to 11g it returns a corrupted string instead of the real XML message. The database NLS_LANG setting is AL32UTF8.
It is said that the JDBC driver should deal with that automatically, but it obviously doesn't in this case. When I dequeue the message using database functionality (DBMS_AQ package) it looks fine, but not when using the JDBC driver, so I think it is a character encoding issue or similar. The message itself is enqueued by the database and is supposed to be retrieved by a dedicated EJB.
Driver file used: ojdbc6.jar
Additional libraries: aqapi.jar, xdb.jar
All files were taken from the 11g database installation.
What should I do to get the XML message correctly?

Do you mean NLS_LANG is AL32UTF8 or the database character set is AL32UTF8? What is the database character set (SELECT value FROM nls_database_parameters WHERE parameter='NLS_CHARACTERSET')?
Thanks,
Sergiusz

Problems with Forms and character encoding

I'm having problems trying to read unicode data inputted into a Form on my JSP page.
I've used the meta tag <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/> to set the charset of the page to UTF-8. I've inputted some Chinese characters into my form and when I try to read the subsequent request parameter in my servlet using request.getParameter() the string returned is this
"&#26469;&#28304;" which is the escape sequence required by HTML to display these characters.
From what I've read on the subject this doesn't seem like the expected value. I've tried other ways of getting the correct string value such as setting the character encoding request.setCharacterEncoding("UTF-8") and then converting the bytes using this encoding value but it doesn't seem to work.
I could write a method to split up the string using the ; as a token and work out the correct Unicode character, but this doesn't seem like the right thing to do.
Any help on how to pass the correct information from the Form in the JSP page to the servlet would be greatly appreciated.

I don't believe that is correct, but if it's returning HTML escapes instead of URL Encoded characters, then it's the browser doing it. This is my test page for playing with Chinese...
<%@ page language="java" contentType="text/html; charset=UTF-8" %>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html>
<head>
<title></title>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
</head>
<body bgcolor="#ffffff" background="" text="#000000" link="#ff0000" vlink="#800000" alink="#ff00ff">
<%
request.setCharacterEncoding("UTF-8");
String str = "\u7528\u6237\u540d";
String name = request.getParameter("name");
%>
req enc: <%= request.getCharacterEncoding() %><br />
rsp enc: <%= response.getCharacterEncoding() %><br />
str: <%= str %><br />
name: <%= name %><br />
<form method="GET" action="_lang.jsp" accept-charset="UTF-8">
Name: <input type="text" name="name" value="" >
<input type="submit" name="submit" value="GET Submit" />
</form>
<form method="POST" action="_lang.jsp" accept-charset="UTF-8">
Name: <input type="text" name="name" value="" >
<input type="submit" name="submit" value="POST Submit" />
</form>
</body>
</html>
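If a parameter still arrives as numeric character references (the `&#26469;&#28304;` form described in the question), they can be decoded on the server as a stopgap. This is only a sketch of that workaround; the cleaner fix is to make the page and form encodings agree so the browser never produces the references. `NcrDecoder` is a hypothetical helper name.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NcrDecoder {
    // Matches decimal numeric character references such as &#26469;
    private static final Pattern NCR = Pattern.compile("&#(\\d{1,7});");

    static String decode(String s) {
        Matcher m = NCR.matcher(s);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(s, last, m.start());
            int codePoint = Integer.parseInt(m.group(1));
            if (Character.isValidCodePoint(codePoint)) {
                out.appendCodePoint(codePoint);
            } else {
                out.append(m.group()); // leave invalid references untouched
            }
            last = m.end();
        }
        out.append(s, last, s.length());
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(decode("&#26469;&#28304;"));
    }
}
```

Note that a user who genuinely types the text `&#26469;` into the form would have it converted too, which is one reason this is a workaround rather than a fix.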

Set character encoding for data template xml output

Hello everyone, in my data template, I have defined the header as
<?xml version="1.0" encoding="WINDOWS-1256"?>
but when output is generated, it is returned as:
<?xml version="1.0" encoding="UTF-8"?>
Is there a way for me to force the WINDOWS-1256 encoding in my data template?
Many Thanks

''This data is read as bytes then I am using the InputStreamReader to convert to UTF-8 encoding.''
Don't you mean "from UTF-8 encoding"? Strings don't have an encoding; bytes can. And do you know that SQL Server produces those bytes encoded in UTF-8, or are you just assuming that?
''The stream is then written to a file with the extension ".xml". When I go and open the file, I get errors stating that the characters were not recognized.''
When you open the file with what? And what errors do you get?
''However, when I open the file with Notepad, I can see my xml data.''
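The point about bytes versus strings can be sketched as a two-step conversion: decode the incoming bytes with the charset they were actually written in, then encode the characters with the charset the output file should use. The source charset in the example (UTF-16LE) is only an assumption for illustration; `Transcode` is a made-up helper name.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class Transcode {
    // Decodes 'input' using 'from' and re-encodes the characters using 'to'.
    static byte[] transcode(byte[] input, Charset from, Charset to) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(input), from);
             Writer writer = new OutputStreamWriter(sink, to)) {
            char[] buffer = new char[4096];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                writer.write(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        byte[] source = "<a>\u00E9</a>".getBytes(StandardCharsets.UTF_16LE);
        byte[] utf8 = transcode(source, StandardCharsets.UTF_16LE, StandardCharsets.UTF_8);
        System.out.println(utf8.length + " bytes of UTF-8");
    }
}
```

Whatever charset is used for the output must also match the encoding named in the XML declaration, otherwise a parser will reject characters that Notepad happily displays.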

Web pages display OK, but print with garbage characters. I think it's character encoding, but don't know WHICH I should use. Have tried all Western and UTF options. Firefox 3.6.12

I used to only have troubles with headers & footers printing out as garbage characters. I tried changing Character Encoding, now entire pages have garbage characters, even though pages view ok when browsing.

If the pages look OK when you are browsing then it is not a problem with the encoding.
It can be a problem with the font that is used. You can try to disable website fonts and possibly try a few different default fonts to see if that helps.
Tools > Options > Content : Fonts & Colors: Advanced (Allow pages to choose their own fonts, instead of my selections above)

Character encoding and ByteOutputStream

Hi!
I'm currently working on a web application that needs to print non-English characters (e.g. Swedish å ä ö). Currently this doesn't work, although I have set the character encoding for the HttpServletResponse.
I figure that it is this code that doesn't manage the non-english characters (it's not my own code but i need to fix it and, sorry to say, I'm not that experienced with streams):
ByteArrayOutputStream baos = new ByteArrayOutputStream(16384);
baos.write("��".getBytes());
resp.setDateHeader("Expires", 0);
resp.setContentLength(baos.size());
ServletOutputStream out = resp.getOutputStream();
out.write(baos.toByteArray());
out.flush();
out.close();
Any hints on what to do or where to look? Should I wrap the ServletOutputStream in a Writer?
Cheers,
David

But there's no character encoding set for this operation:
baos.write("��".getBytes());
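A sketch of the fix this reply points to: pass an explicit charset to `getBytes` instead of relying on the platform default. The literal below (å ä ö) is an assumption, since the characters in the original snippet did not survive; in the servlet itself the response's declared charset (for example via `resp.setContentType("text/html; charset=UTF-8")`) would need to match the charset used here.

```java
import java.nio.charset.StandardCharsets;

final class ExplicitBytes {
    // \\u escapes keep the source file pure ASCII, so the compiler's
    // source-encoding setting cannot corrupt the literal.
    static final String SWEDISH = "\u00E5\u00E4\u00F6"; // å ä ö

    // Encodes the text with a charset chosen by the code, not by the platform.
    static byte[] body(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println("UTF-8 bytes:      " + body(SWEDISH).length);
        System.out.println("ISO-8859-1 bytes: "
                + SWEDISH.getBytes(StandardCharsets.ISO_8859_1).length);
    }
}
```

The byte count differs between the two encodings, which is also why `setContentLength` has to be computed from the encoded bytes rather than from the string length.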

Character Encoding and File Encoding issue

Hi,
I have a file which has a data encoded using default locale.
I start the JVM in the same default locale and try to read the file.
I took 2 approaches:
1. Read the file using InputStreamReader() without specifying the encoding, so that the default one based on the locale will be picked up.
-- This approach worked fine.
-- I also printed the system property "file.encoding", which matched the current locale's encoding (on Unix the command to get this is "locale charmap").
2. In this approach, I read the file using InputStream as an array of raw bytes, and passed it to the String constructor to convert bytes to String.
-- The String contained garbled data, meaning encoding failed.
I tried printing encoding used by JVM using internal class, and "file.encoding" property as well.
These 2 values do not match, there is weird difference.
For e.g. for locale ja_JP.eucjp on linux box :
byte-character uses EUC_JP_LINUX encoding
file.encoding system property is EUC-JP-LINUX
To get byte to character encoding, I used following methods (sun.io.*):
ByteToCharConverter btc = ByteToCharConverter.getDefault();
System.out.println("BTC uses " + btc.getCharacterEncoding());
Do you have any idea why it is failing?
My understanding was that file encoding and character encoding should always be the same by default.
But, because of this behaviour, I am a little perplexed.
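One possible cause of garbling in the second approach, offered as a guess rather than a diagnosis, is decoding the raw bytes one buffer at a time: a multi-byte character that straddles two buffers is then broken in both. The sketch below contrasts that with decoding the complete byte array using an explicitly named charset. `ExplicitDecode` and its methods are hypothetical names.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class ExplicitDecode {
    // Decodes the complete byte array in one step with a named charset.
    static String decodeWhole(byte[] raw, Charset cs) {
        return new String(raw, cs);
    }

    // Deliberately fragile: decodes each chunk on its own, so a character
    // split across a chunk boundary is lost.
    static String decodeInChunks(byte[] raw, Charset cs, int chunkSize) {
        StringBuilder out = new StringBuilder();
        for (int start = 0; start < raw.length; start += chunkSize) {
            int end = Math.min(raw.length, start + chunkSize);
            out.append(new String(Arrays.copyOfRange(raw, start, end), cs));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println("file.encoding:   " + System.getProperty("file.encoding"));
        byte[] raw = "\u65E5\u672C".getBytes(StandardCharsets.UTF_8); // two 3-byte characters
        System.out.println(decodeWhole(raw, StandardCharsets.UTF_8)
                .equals(decodeInChunks(raw, StandardCharsets.UTF_8, 4)));
    }
}
```

Naming the charset explicitly also sidesteps the question of whether the internal converter name and the "file.encoding" property refer to the same encoding.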
