Garbage characters when retrieving HTML via Java
I wanted to use Java to extract my character's profile from http://www.magelo.com. The results returned from the URL are basically garbled characters; however, when retrieving cnn.com or yahoo.com the results are fine. So the only thing I can think of is that the magelo web server detects that the client is Java and, since it may not want people to use this approach for data mining, changes the output to garbage. I tried setting the connection's "User-Agent" property and others to mimic a browser, but nothing worked; the only way I can view non-garbled HTML is by using a web browser. I'm not exactly sure that this is what is going on, but I can't come up with another explanation.
code & garbage.
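One possibility worth ruling out before assuming deliberate obfuscation (an assumption; nothing in the post confirms it) is that the server sends a compressed body. If the response carries a Content-Encoding: gzip header and the bytes are printed without decompressing them, the output looks like garbage. A minimal sketch of decoding a body according to that header follows; the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: turns raw HTTP body bytes into text, honoring Content-Encoding.
class BodyDecoder {

    // contentEncoding is the value of the Content-Encoding response header (may be null).
    static String decode(byte[] body, String contentEncoding, Charset charset) {
        try {
            InputStream in = new ByteArrayInputStream(body);
            if (contentEncoding != null && contentEncoding.equalsIgnoreCase("gzip")) {
                in = new GZIPInputStream(in); // decompress before decoding the text
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), charset);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Simulates a gzip-compressed response body so the sketch runs without a network.
    static byte[] gzip(String text, Charset charset) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (GZIPOutputStream gz = new GZIPOutputStream(bytes)) {
                gz.write(text.getBytes(charset));
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] compressed = gzip("<html>profile</html>", StandardCharsets.UTF_8);
        System.out.println(decode(compressed, "gzip", StandardCharsets.UTF_8));
    }
}
```

With a real java.net.URLConnection, the header value would come from getContentEncoding() and the bytes from getInputStream(); the charset would still have to be taken from the Content-Type header or the page itself.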
Similar Messages
-
When using the file upload functionality of the Servlet 3 specification on WebLogic 12c (12.1.2.0), the values of the other form items (<input type="text">) came through as garbage characters.
Are special settings needed?
【Note】
With a normal request (application/x-www-form-urlencoded, i.e. without enctype="multipart/form-data"), the submitted values are not garbled.
The file name and the content of the uploaded file are not garbled.
I confirmed with a debugger that the value stored in the temporary file written during the upload is not garbled at that stage.
HttpServletRequest#setCharacterEncoding("UTF-8") is called.
【Environment Information】
OS : MacOS X 10.8.5
JVM : Oracle Java7
VM Encoding : UTF-8 (-Dfile.encoding=UTF-8)
WebPage Encoding : UTF-8
OS LANG : LANG=ja_JP.UTF-8
IDE : STS(Spring Tool Suite)
Boot Platform : WTP for Weblogic12.1.2.0
Framework : Spring MVC(3.2.4)
I would like to know how to resolve this behavior.
If anyone knows anything about it, I would appreciate hearing how to fix it.
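A symptom like this often appears when the UTF-8 bytes of a text field are decoded as ISO-8859-1 somewhere in the multipart parsing path. Whether that is what happens in this WebLogic setup is an assumption. If it is, the values can be repaired after the fact, because ISO-8859-1 maps every byte to exactly one character and the original bytes can therefore be recovered. A minimal sketch (the class and method names are hypothetical):

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper illustrating the UTF-8 / ISO-8859-1 mix-up and its reversal.
class EncodingRepair {

    // Reproduces the problem: UTF-8 bytes interpreted as ISO-8859-1 text.
    static String misdecode(String original) {
        return new String(original.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    // Reverses it: recover the raw bytes, then decode them as UTF-8.
    static String repair(String garbled) {
        return new String(garbled.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u65E5\u672C\u8A9E"; // the three characters of the word "Japanese" in Japanese
        String garbled = misdecode(original);
        System.out.println(garbled.length());                 // 9 characters, one per UTF-8 byte
        System.out.println(repair(garbled).equals(original)); // true
    }
}
```

This only works while the garbled string still contains one character per original byte; once bytes have been replaced with "?" the information is gone.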
Message was edited by: user11123661 (changed main language from Japanese to English).
The basic problem is not obscure; it has come up countless times since Tiger was released. See this note and try Fix C (dingbat) to see if it will help:
http://homepage.mac.com/thgewecke/woutlook.html
-
Leading garbage characters when using CipherInputStream
So, after receiving an encrypted message, I can decrypt it perfectly except that I get a random amount of leading garbage characters. Using the same plaintext, here are examples of the beginning of the output file for two runs (using od -c to look at the files):
0000000 315 7 004 371 242 \0 w ` t h e L L S E
and
0000000 1 " 246 317 0 j 321 V t h e L L S E
The part beginning with "The LLSE..." is correct.
The fact that the leading garbage appears to be random leads me to believe that there is something wrong with the crypto rather than with the file I/O.
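In both dumps exactly eight bytes precede the readable text, which matches the DES block size. CFB8 is self-synchronizing: whatever state the decryptor starts from, its output becomes correct once eight ciphertext bytes have passed through the shift register. The sketch below demonstrates that property with a deliberately mismatched IV. Whether the cause in this case is an IV mismatch or extra bytes ahead of the ciphertext is an assumption the post does not settle. The key, IVs and class name are hypothetical, and NoPadding is used instead of PKCS5Padding to keep the demonstration short.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical demo: CFB8 recovers after one block when the IV is wrong.
class Cfb8IvDemo {

    // Runs DES/CFB8 in the given mode (Cipher.ENCRYPT_MODE or Cipher.DECRYPT_MODE).
    static byte[] run(int mode, byte[] key, byte[] iv, byte[] input) {
        try {
            Cipher cipher = Cipher.getInstance("DES/CFB8/NoPadding");
            cipher.init(mode, new SecretKeySpec(key, "DES"), new IvParameterSpec(iv));
            return cipher.doFinal(input);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] key = {1, 2, 3, 4, 5, 6, 7, 8};                // made-up key
        byte[] senderIv = {10, 11, 12, 13, 14, 15, 16, 17};   // made-up IV used to encrypt
        byte[] receiverIv = {20, 21, 22, 23, 24, 25, 26, 27}; // different IV used to decrypt
        byte[] plain = "the LLSE sample message".getBytes(StandardCharsets.US_ASCII);

        byte[] cipherText = run(Cipher.ENCRYPT_MODE, key, senderIv, plain);
        byte[] decrypted = run(Cipher.DECRYPT_MODE, key, receiverIv, cipherText);

        // Only the first 8 bytes depend on the IV; everything after them matches.
        boolean tailMatches = Arrays.equals(
                Arrays.copyOfRange(plain, 8, plain.length),
                Arrays.copyOfRange(decrypted, 8, decrypted.length));
        System.out.println("bytes from offset 8 match: " + tailMatches);
    }
}
```

If both sides really do derive the same IV, it may be worth comparing the first bytes of the encrypted file on the sender and on the receiver to see whether anything was added in transit.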
Here's the relevant code chunk:
Cipher sharedCypher = Cipher.getInstance("DES/CFB8/PKCS5Padding");
SecretKeySpec DESKeySpec = new SecretKeySpec(clientDESKey.getEncoded(), "DES");
IvParameterSpec iv = new IvParameterSpec(clientDESKey.getEncoded());
sharedCypher.init(Cipher.DECRYPT_MODE, DESKeySpec, iv);
CipherInputStream cis = new CipherInputStream(new FileInputStream(WD), sharedCypher);
String time = ""+System.currentTimeMillis();
File outputFileW = new File("/tmp/new_wireless_data."+time);
System.out.println(System.currentTimeMillis()+": Output file is" + outputFileW.getAbsolutePath());
outputFileW.createNewFile();
FileOutputStream fos = new FileOutputStream(outputFileW);
byte[] putCypherBytes = new byte[8];
int i=0;
while((i=cis.read(putCypherBytes)) != -1) {
    fos.write(putCypherBytes, 0, i);
}
Any thoughts on cleaning this up would be greatly appreciated.
Best,
Glenn
Well, I seriously doubt it's the JDK or JCE. I don't have time right now to download that exact version and test it, but instead I'll give you the code I pieced together from your posts. Put your text to be encrypted in a file called plaintext.txt in the directory you run the class in. The decrypted text should appear in new_wireless_data...
If you can run this without error, then your problem most likely lies on your server side.
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.CipherInputStream;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class CryptoTest {
    public static String encryptedFile = "encrypted.txt";
    public static String plainTextFile = "plaintext.txt";
    public static SecretKey clientDESKey;
    public static String keyString = "emePXfjmLNw=";

    public static void main(String[] args) {
        try {
            // java.util.Base64 (Java 8+) stands in for sun.misc.BASE64Decoder,
            // which newer JDKs no longer ship.
            byte[] keyBytes = Base64.getDecoder().decode(keyString);
            clientDESKey = new SecretKeySpec(keyBytes, "DES");
            _main(args);
        } catch (Exception e) {
            System.out.println("ERROR! : " + e.getMessage());
            e.printStackTrace();
        }
    }

    public static void _main(String[] args) throws Exception {
        // Encrypt plaintext.txt into encrypted.txt (the key bytes double as the IV).
        Cipher serverCypher = Cipher.getInstance("DES/CFB8/PKCS5Padding");
        serverCypher.init(Cipher.ENCRYPT_MODE, clientDESKey,
                new IvParameterSpec(clientDESKey.getEncoded()));
        File inputFileW = new File(plainTextFile);
        CipherInputStream cis1 =
                new CipherInputStream(new FileInputStream(inputFileW), serverCypher);
        File outputFileW1 = new File(encryptedFile);
        FileOutputStream fos1 = new FileOutputStream(outputFileW1);
        byte[] putCypherBytes1 = new byte[8];
        int i1 = 0;
        while ((i1 = cis1.read(putCypherBytes1)) != -1) {
            fos1.write(putCypherBytes1, 0, i1);
        }
        cis1.close();
        fos1.close();
        inputFileW.delete();
        //=========================================================
        // Decrypt encrypted.txt with the same key and IV.
        Cipher sharedCypher = Cipher.getInstance("DES/CFB8/PKCS5Padding");
        SecretKeySpec DESKeySpec = new SecretKeySpec(clientDESKey.getEncoded(), "DES");
        IvParameterSpec iv = new IvParameterSpec(clientDESKey.getEncoded());
        sharedCypher.init(Cipher.DECRYPT_MODE, DESKeySpec, iv);
        CipherInputStream cis = new CipherInputStream(new FileInputStream(encryptedFile), sharedCypher);
        String time = "" + System.currentTimeMillis();
        File outputFileW = new File("new_wireless_data." + time);
        System.out.println(System.currentTimeMillis() + ": Output file is" + outputFileW.getAbsolutePath());
        outputFileW.createNewFile();
        FileOutputStream fos = new FileOutputStream(outputFileW);
        byte[] putCypherBytes = new byte[8];
        int i = 0;
        while ((i = cis.read(putCypherBytes)) != -1) {
            fos.write(putCypherBytes, 0, i);
        }
        cis.close();
        fos.close();
    }
}
-
Garbage characters when using CR XI R1 with Sharp M350 printer PCL5e driver
My application uses CR XI R1. We have a customer who has a Sharp M350 printer. When printing or previewing a report from my application with this printer set as the default, the text is garbled - the characters look like gibberish.
If they change the default printer to another printer, it works fine, but they need to be able to print to the Sharp printer, so I need to find a solution to the problem.
Changing the printer driver to PCL6 has no effect. Does anyone have any suggestions for fixing this?
A few questions:
1) What development language are you using?
2) Have you ever applied any CR Service Packs?
3) Have you checked for any updates to the Sharp M350 printer driver?
4) Can you duplicate the issue on your development system?
5) Are you able to print correctly to that printer using the CR designer (even if, as a test, you have to install it on one of the client machines)?
6) Which Crystal Reports SDK are you using? RDC? If so, which CR DLLs are referenced in your app?
Ludek
-
Garbage characters displayed when saving InfoView Page Layout in Chinese
Hi,
I'm new to BO, so I don't know whether I'm giving the required information or not. We integrated BO with our product. The configuration details are as follows.
Server: win2k3 sp2
Database: Embedded Sybase SQL
Client: winxp sp3 (Chinese) + FF3.6.8
I logged in to BO InfoView with Chinese as the "Product Locale", went to the InfoView Page Layout page, and selected the Save As option. Now I can see some garbage characters along with some Chinese characters in the Title text field. I want to eliminate those garbage characters.
I want to know the following.
1. What could be the reason for such garbage characters appearing there?
2. How can they be eliminated?
3. Are we missing any required data while integrating BO with our product that causes this problem?
Thanks in advance,
Vasu
-
Garbage characters in iTunes when copying songs in chinese language
I copied some English and Chinese songs from my external hard disk to iTunes. Instead of displaying the artist and song names in simplified/traditional Chinese for the Chinese songs, it is showing garbage characters. The English songs are displayed correctly. What should I do?
I tried converting the ID3 tags (all ID3 versions and reverse Unicode) but it still doesn't work.
They may need to be converted to Unicode from a legacy Chinese encoding. Try ConvertZ
http://www.bumpersoft.com/Educationand_Science/Languages/ConvertZ12649.htm
-
Garbage Characters in Netscape
Hi,
I open a 'jsp' in a new popup (window.open). Netscape shows some garbage characters at the top of the page. (Works fine with IE). I tried deleting ALL the code from the popup jsp to trace out the problem (even removed the HTML tags). But Netscape still shows the garbage characters. Where are they coming from??? Any help is appreciated!
Ashish Bhave
-
Calling EJB with HTML via SERVLET
Hi,
I used a written example that calls an EJB from HTML via a servlet. The example is named Bonus. The problem I have is that the HTML page throws an error while calling the servlet. I can't figure out what seems to be the problem. Does anyone know?
I wonder if the problem is in servlet? The EJB is fine!
christian
HTML CODE:(bonus.html)
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=windows-1250"/>
<TITLE>untitled1</TITLE>
</HEAD>
<BODY BGCOLOR = "WHITE">
<BLOCKQUOTE>
<H3>Bonus Calculation</H3>
<FORM METHOD="GET" ACTION="BonusAlias">
<P>Enter social security Number:<P>
<INPUT TYPE="TEXT" NAME="SOCSEC"></INPUT>
</P>
Enter Multiplier:
<P>
<INPUT TYPE="TEXT" NAME="MULTIPLIER"></INPUT>
</P>
<INPUT TYPE="SUBMIT" VALUE="Submit">
<INPUT TYPE="RESET">
</FORM>
</BLOCKQUOTE>
</BODY>
</HTML>
SERVLET CODE:(BonusServlet.java)
package mypackage5;

import mypackage5.Calc;
import mypackage5.CalcHome;
import mypackage5.impl.CalcBean;
import javax.servlet.*;
import javax.servlet.http.*;
import java.io.*;
import javax.naming.*;
import javax.rmi.PortableRemoteObject;
import java.beans.*;

public class BonusServlet extends HttpServlet {
    CalcHome homecalc;

    public void init(ServletConfig config) throws ServletException {
        super.init(config);
        // Look up home interface
        try {
            //InitialContext ctx = new InitialContext();
            //Object objref = ctx.lookup("Calc");
            //homecalc = (CalcHome)PortableRemoteObject.narrow(objref, CalcHome.class);
            Context context = new InitialContext();
            // Store the home in the field; assigning a local variable here
            // would leave homecalc null when doGet runs.
            homecalc = (CalcHome) PortableRemoteObject.narrow(context.lookup("Calc"), CalcHome.class);
        } catch (Exception namingException) {
            namingException.printStackTrace();
        }
    }

    public void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        String socsec = null;
        int multiplier = 0;
        double calc = 0.0;
        PrintWriter out;
        response.setContentType("text/html");
        String title = "EJB Example";
        out = response.getWriter();
        out.println("<HTML><HEAD><TITLE>");
        out.println(title);
        out.println("</TITLE></HEAD><BODY>");
        try {
            Calc theCalculation;
            // Get Multiplier and Social Security Information
            String strMult = request.getParameter("MULTIPLIER");
            multiplier = Integer.parseInt(strMult);
            socsec = request.getParameter("SOCSEC");
            // Calculate bonus
            double bonus = 100.00;
            theCalculation = homecalc.create();
            calc = theCalculation.calcBonus(multiplier, bonus);
        } catch (Exception createException) {
            createException.printStackTrace();
        }
        // Display Data
        out.println("<H1>Bonus Calculation</H1>");
        out.println("<P>Soc Sec: " + socsec + "</P>");
        out.println("<P>Multiplier: " + multiplier + "</P>");
        out.println("<P>Bonus Amount: " + calc + "</P>");
        out.println("</BODY></HTML>");
        out.close();
    }

    public void destroy() {
        System.out.println("Destroy");
    }
}
The error is that the page cannot be found! When I run only the servlet it works; when I run the HTML page and fill in the fields, it throws an error that the page cannot be found!
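A "page cannot be found" response on submit usually means the URL the form posts to is not mapped to anything. The browser resolves the form's ACTION relative to the URL of the page that contains it, so it can help to work out which URL that is and compare it with the servlet mapping. A minimal sketch of that resolution; the host, port and the "bonus" context path are made-up values, and the class name is hypothetical:

```java
import java.net.URI;

// Hypothetical helper: shows which URL a form ACTION resolves to.
class FormActionResolver {

    // Resolves the ACTION attribute against the URL of the page containing the form.
    static String resolve(String pageUrl, String action) {
        return URI.create(pageUrl).resolve(action).toString();
    }

    public static void main(String[] args) {
        // Page at the root of the web application: the servlet must answer at /bonus/BonusAlias.
        System.out.println(resolve("http://localhost:8080/bonus/bonus.html", "BonusAlias"));
        // Page in a subdirectory: the same ACTION now points inside that subdirectory.
        System.out.println(resolve("http://localhost:8080/bonus/pages/bonus.html", "BonusAlias"));
        // A leading slash drops the context path entirely.
        System.out.println(resolve("http://localhost:8080/bonus/bonus.html", "/BonusAlias"));
    }
}
```

If the resolved URL differs from the URL that works when the servlet is run on its own, the ACTION attribute or the servlet mapping for BonusAlias is the place to look.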
Thanks,
Christian
-
Removing numerals/garbage characters from search in Section 508 build WebHelp
I need to remove (prevent inclusion of) numerals and garbage characters from search results in WebHelp when compiled with Section 508 Output enabled. I need to have Section 508 enabled. Can that be done?
Thanks for your time!
Thanks, Jeff. I tried including the characters in the Stop list and recompiling. (I even closed the project and reopened it.) The characters still appear. The characters I'm trying to remove are: !, #, ', (, ), -, /, :, ;, and ,. I am also trying to exclude some numbers (100.00usd, 1999.99, 2999.99, 3999.99, 4999.99, and 6pm).
The Stop list consists of those characters and the default text. I forgot to mention in the first post that I'm using RH9.
The Section 508 flag does create 508-compliant HTML output, and that is one of the requirements for this help system. This is the first time I am enabling this flag.
-
AIX + Korean : garbage characters
I have Tomcat installed on an AIX machine. When the user enters some Korean characters, I get garbage characters on the Tomcat side. The issue occurs only when Tomcat is on an AIX machine. Is anybody aware of this issue?
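One common cause of platform-dependent garbage (an assumption here, since the post gives no configuration details) is that form parameters are percent-decoded with a default charset that differs from the one the browser used to encode them. The sketch below decodes the same submitted bytes with two charsets to show the effect; the class name is hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo: the same percent-encoded form value decoded two ways.
class FormDecodingDemo {

    // Percent-decodes a form value using the named charset.
    static String decode(String encoded, String charsetName) {
        try {
            return URLDecoder.decode(encoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // "%ED%95%9C%EA%B8%80" is how a browser submits the two Hangul
        // syllables U+D55C U+AE00 from a UTF-8 page.
        String submitted = "%ED%95%9C%EA%B8%80";
        System.out.println(decode(submitted, "UTF-8").length());      // 2 characters
        System.out.println(decode(submitted, "ISO-8859-1").length()); // 6 garbage characters
    }
}
```

In a servlet, the usual way to pin the charset for POST bodies is to call request.setCharacterEncoding("UTF-8") before the first parameter is read; comparing the JVM's file.encoding on the AIX machine with a machine that works may also be informative.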
and again misunderstood ;-)
begin owa_util.print_cgi_env; end;
=
PLSQL_GATEWAY = WebDb
GATEWAY_IVERSION = 3
SERVER_SOFTWARE = Oracle-Application-Server-10g/9.0.4.0.0 Oracle-HTTP-Server
GATEWAY_INTERFACE = CGI/1.1
SERVER_PORT = 7780
SERVER_NAME = los-bd4.intranet.l-os.de
REQUEST_METHOD = POST
PATH_INFO = /wwv_flow.show
SCRIPT_NAME = /pls/htmldb
REMOTE_ADDR = 10.220.110.200
SERVER_PROTOCOL = HTTP/1.1
REQUEST_PROTOCOL = HTTP
REMOTE_USER = HTMLDB_PUBLIC_USER
HTTP_CONTENT_LENGTH = 297
HTTP_CONTENT_TYPE = application/x-www-form-urlencoded
HTTP_USER_AGENT = Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.8.1.3) Gecko/20070309 Firefox/2.0.0.3
HTTP_HOST = 10.220.110.22:7780
HTTP_ACCEPT = text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
HTTP_ACCEPT_ENCODING = gzip,deflate
HTTP_ACCEPT_LANGUAGE = de-de,de;q=0.8,en;q=0.5,en-us;q=0.3
HTTP_ACCEPT_CHARSET = ISO-8859-1,utf-8;q=0.7,*;q=0.7
HTTP_REFERER = http://10.220.110.22:7780/pls/htmldb/f?p=4500:1003:2050165973690504::NO:::
HTTP_ORACLE_ECID = 1177512000:10.220.110.22:2296:0:624,0
WEB_AUTHENT_PREFIX =
DAD_NAME = htmldb
DOC_ACCESS_PATH = docs
DOCUMENT_TABLE = wwv_flow_file_objects$
PATH_ALIAS =
REQUEST_CHARSET = AL32UTF8
REQUEST_IANA_CHARSET = UTF-8
SCRIPT_PREFIX = /pls
HTTP_COOKIE = ISCOOKIE=true; ORACLE_PLATFORM_REMEMBER_UN=MERKER:zentral; WWV_FLOW_USER2=5352414403960629
Statement processed.
-
Garbage Characters in CDE Window Title Bar
I recently patched our Solaris 8 Sun workstations (a mixture of Blade 150, Ultra10, Blade 1500, and Blade 2000) with the recommended patch cluster from December 19th, 2005. At the same time, I updated the systems' Java to 1.4.2_10, and installed Update 5 to StarOffice7, which is still on the systems along with the newer StarOffice8.
When initially installed before these recent patches, StarOffice8 behaved correctly. After updating the systems, when I open StarOffice8 in a CDE session, I get garbage characters in the title bar of the window. The letters and fonts in the application menus are normal. StarOffice8 windows under Gnome sessions are titled correctly.
I've read some emails about issues with LC_CTYPE and setting it to en_US.ISO8859-15, using a wrapper script to start soffice. The current setting on my machines is en_US.ISO8859-1. That solution doesn't work consistently on different machines (or even on the same machine).
In fact, even without attempting to solve the problem, there is inconsistent behavior. On one Ultra10 which I have as a testbed machine, StarOffice8 behaves correctly, whether I'm logged in as a normal user, or as root. On a Blade 1500, it behaves correctly when logged in as root, but not as a normal user. I've also patched StarOffice8 with Update1 which was released today, but it doesn't fix the problem.
Anybody else having similar problems after recent patching? Or have any suggestions for a solution?
Jeff Bailey
More Info on Unsolved Problem:
If I use Mozilla to open a document file, with soffice set as the helper application, the CDE window title displays properly. If I leave that instance of StarOffice open, and open new documents using the menus within StarOffice, subsequent CDE titles also display normally. Also with that original Mozilla-driven StarOffice application still open, if I use "soffice whatever.doc" on a command line, those CDE windows appear normal.
Mozilla is set to use "en_US" as the language for webpages, with Western ISO-8859-1 as the default character coding. The Solaris 8 workstation's /etc/default/init file is configured as:
TZ=US/Eastern
CMASK=022
LC_COLLATE=en_US.ISO8859-1
LC_CTYPE=en_US.ISO8859-1
LC_MESSAGES=C
LC_MONETARY=en_US.ISO8859-1
LC_NUMERIC=en_US.ISO8859-1
LC_TIME=en_US.ISO8859-1
While the Mozilla "wrapper" is a workaround for now, I don't see it as a final solution.
Jeff Bailey
-
Garbage characters in excel file opened by jsp
I am storing an Excel file as a BLOB in the database. When I retrieve it from the database and open it in the JSP page, it shows a lot of garbage characters and all formatting is lost. I am using content type "application/vnd.ms-excel". I am also setting the MIME type in web.xml as application/excel. It is WebLogic 6.1 SP2 with Oracle 8.1.6. PDF and Word docs are working well. Please help soon.
Thanks
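One failure mode that produces exactly this symptom is binary data passing through a character-oriented writer on its way to the browser, as can happen when a JSP rather than a servlet output stream emits the BLOB. Whether that is the cause here is an assumption, and the two different MIME types mentioned in the post are worth checking as well. The sketch below only shows why binary bytes do not survive a trip through text while a plain byte copy does; the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo: binary content is damaged by text conversion, not by byte copying.
class BinaryCopyDemo {

    // What effectively happens when bytes are pushed through a character writer.
    static byte[] viaText(byte[] data) {
        return new String(data, StandardCharsets.UTF_8).getBytes(StandardCharsets.UTF_8);
    }

    // Byte-for-byte copy, the way a BLOB stream should be sent to the response.
    static void copy(InputStream in, OutputStream out) {
        try {
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // First eight bytes of a legacy .xls (OLE2 compound) file.
        byte[] header = {(byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
                         (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1};
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        copy(new ByteArrayInputStream(header), out);
        System.out.println("byte copy intact: " + Arrays.equals(header, out.toByteArray()));
        System.out.println("text round trip intact: " + Arrays.equals(header, viaText(header)));
    }
}
```

In a servlet, the copy target would be response.getOutputStream(), with the content type set before the first byte is written.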
Download the Open G Toolkit from www.openg.org. There is a VI called Quit Application.vi that works great. I have used it with the very stupid Brooks 0154 SmartDDE Controller program to reset the application.
Be sure to save the document and close the DDE communication first.
Michael
www.abcdefirm.com
Michael Munroe, ABCDEF
Certified LabVIEW Developer, MCP
Find and fix bad VI Properties with Property Inspector
-
I uploaded my first podcast test <http://www.sooline.org/podcasts> and the iWeb pages display garbage characters from my ISP's Unix Web server. It looks fine when viewed from the folder on my hard drive. Is there anything I can do short of manually editing the HTML of every file every time I update the site? Any suggestions or recommendations would be much appreciated. -- Rick
Dual G5 Mac OS X (10.4.4)
See this note for fixes:
http://homepage.mac.com/thgewecke/iwebchars.html
-
Garbage characters in web pages
Produced a page in iWeb then uploaded it to my webspace using Secure FTP. Now I have garbage characters in it with  wherever there's a return and a bunch of garbage for every apostrophe.
Obviously I have some sort of problem with the character set but how to get rid of it?
I've tried copying text over from a TXT file without the returns, adding them later in iWeb, but it made no difference.
When I open iWeb
and open this project, I have no idea which set iWeb
will end up using. I have to check the file
attributes afterward to figure out which one was the
set "du jour".
You are aware that iWeb does not open published files, right? It only opens the Domain.sites file where its data is kept:
http://homepage.mac.com/thgewecke/iwebdata.html
Also iWeb cannot register any changes you make in a published site with an editor, so that needs to be redone every time the site is republished.
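Because any fix applied to the published files has to be repeated after each republish, it may be easier to script it than to edit by hand. One approach that does not depend on which charset the server declares is to rewrite every non-ASCII character as a numeric character reference, since the resulting file is pure ASCII. A minimal sketch of that conversion follows; the class and method names are hypothetical, and reading and writing the published files is left out.

```java
// Hypothetical helper: makes HTML text ASCII-only by using numeric character references.
class HtmlEntityEscaper {

    // Replaces every code point above 127 with &#NNNN; and leaves ASCII untouched.
    static String toNumericReferences(String html) {
        StringBuilder out = new StringBuilder(html.length());
        html.codePoints().forEach(cp -> {
            if (cp < 128) {
                out.append((char) cp);
            } else {
                out.append("&#").append(cp).append(';');
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        // A curly apostrophe (U+2019) and a non-breaking space (U+00A0),
        // the two characters that showed up as garbage in the question.
        System.out.println(toNumericReferences("it\u2019s\u00A0here"));
    }
}
```

The input has to be read with the encoding the files were actually written in (UTF-8 for iWeb output, as far as the discussion here suggests); otherwise the garbage is simply preserved in reference form.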
Normally if you have multiple projects it would be a good idea to keep their data in separate Domain files.
PageSpinner file open function cannot see any folder
or file created by iWeb except the one called "Sites"
and that one is empty.
I have no problem using PageSpinner to open the files created by iWeb. Since iWeb does not create anything called "Sites," I'm wondering if the Open function was going instead to Home/Sites or some other location rather than Home/Documents/HTML or wherever you published your pages.
As far as my ISP goes, I'll take your word for it,
but I can't imagine an ISP of this size getting this
wrong and not receiving/responding to complaints.
None of the major companies devoted to hosting web pages which most people use have this problem as far as I know. ISPs for whom hosting is a sideline sometimes do not pay attention to the issue and leave their server set to force all browsers to use ISO-8859-1 encoding. They will not likely get any complaints if their clients are all using Roman script and traditional web editors which default to that setting as well.
-
Something strange started happening this morning...
I am getting garbage characters in some, but not all of the dialog windows.
The screenshot attached occured when reloading a VI after changing its location. It's repeatable and happens again after a reboot.
Anyone seen this kind of behavior? Any solutions?
Something which may or may not be related: some weeks ago, when building a VI into an executable, the resulting front panel had similar characters in the menu bar. The only way I found to solve that one was to select "support all languages" in the build specifications under run-time languages (only English was ticked before).
LabVIEW 2010, TestStand 2010
Attachments:
dialog_error.jpg 115 KB
Hi,
That is very strange. I've seen this only once before, when reporting an error into an error cluster indicator over a real-time target, but that was a one-time event. Does the PC you're on have any issues other than this, i.e. occasional Blue Screens or crashes? The only thing I can think of is a memory location on your PC that's having issues and that LabVIEW is occasionally using.
It may be worth calling your local branch or e-mailing directly via www.ni.com/support; they may recommend a re-installation of the NI software and provide you with a tool to ease this process. But this is certainly the first time I've heard of this on the LabVIEW dialog boxes! Have any changes been made to the machine itself in terms of language additions or software addition/removal?
Kind Regards,
Applications Engineer