Writing a file with RFile and reading it with TFileText (Unicode)

Fri, 2005-04-29 11:43
Joined: 2005-04-07
Forum posts: 21
Is it possible to write a file in Unicode format?

I have used TFileText and TLex to read a file that was previously created in Unicode format.

Now I want to write the file that will be read, but RFile::Write only accepts TDes8 and TFileText::Read only accepts TDes16. If I have to write it with a TDes8, I will not be able to read it back as if it were Unicode.

Fri, 2005-04-29 11:50
Forum Nokia Champion
Joined: 2003-10-01
Forum posts: 723
But if you have a buffer containing Unicode characters, then it shouldn't matter if you write it into a file byte by byte.

tOtE

Gabor Torok
Software architect, Agil Eight (http://www.agileight.com/)
Blog: http://mobile-thoughts.blogspot.com/

Fri, 2005-04-29 11:53
Joined: 2004-07-28
Forum posts: 1379
As tote said, it doesn't matter.

However, to be _really_ Unicode, you should write the correct byte order mark (BOM) at the start of the file.

Otherwise, some text editors will think it's ASCII and you'll end up with

H E L L O

didster

Fri, 2005-04-29 14:22
Joined: 2005-04-07
Forum posts: 21
And how can I do that (write the BOM)?

I have a TDes16 and I write it to a file with RFileWriteStream. Then I can read it back fine with TFileText.

But, as you said, the resulting file looks strange when I open it in an ASCII editor.
Fri, 2005-04-29 14:37
Joined: 2004-07-28
Forum posts: 1379
Noooo, not RFileWriteStream.

That writes a TDesC out in external format.

That is, with codes that say "hey, I'm a stream", plus the length of the descriptor and all that.

Just use plain old RFile.

didster

Fri, 2005-04-29 14:42
Joined: 2005-04-07
Forum posts: 21
But that is the only way I found to write my TDes16 to a file, as RFile::Write only takes a TDes8. I did not find another way to write my Unicode TDes16 to a file byte by byte.

Sorry, perhaps I am not understanding the idea correctly.
Fri, 2005-04-29 14:51
Joined: 2004-07-28
Forum posts: 1379
OK, some explanation.

RFileWriteStream isn't for writing to a file so you can read it with another app.  It's for writing to a file so you can read it back as a stream, from your Symbian application.  It writes special magic codes along with the actual data that identify the stream type, and the data types as you write them.

Yes, RFile::Write expects TDesC8.  That's because every file, even a Unicode text file, ultimately boils down to 8-bit data.

You have two choices.  Either you convert the Unicode text to ASCII (8-bit) and write that, or you really write the Unicode data to the file.  Here is how you do both.

Firstly, the conversion:

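Code:
RFile aAlreadyOpenedFile;
TBuf16<20> aMyFileData = ....

// Narrow the 16-bit text into an 8-bit buffer with the same capacity in elements
TBuf8<20> aMyFileDataAs8Bit;
aMyFileDataAs8Bit.Copy(aMyFileData);

// Write one byte per character
aAlreadyOpenedFile.Write(aMyFileDataAs8Bit);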

This copies the 16-bit data into an 8-bit descriptor, one element per character, then writes the text to the file. Any character with a value of 256 or above cannot be represented in 8 bits and is lost in the copy.

If the 16-bit descriptor contains "HELLO", the contents of the file are:

HELLO

Now, actually writing the Unicode data to the file:


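Code:
// View a 16-bit descriptor's memory as 8-bit data: Size() is the length in bytes
#define DES_AS_8_BIT(str) (TPtrC8((const TText8*)((str).Ptr()), (str).Size()))

RFile aAlreadyOpenedFile;
TBuf16<20> aMyFileData = ....
// Writes two bytes per character, exactly as they sit in memory
aAlreadyOpenedFile.Write(DES_AS_8_BIT(aMyFileData));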

This basically casts (it doesn't convert, it just casts) the 16-bit descriptor to an 8-bit one so you can pass it to the file API.

Again, let's say the 16-bit descriptor contains "HELLO".  The contents of the file are now:

H E L L O

That is, Unicode data.
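
To see why an ASCII editor shows gaps between the letters, here is a minimal sketch in standard C++ (not Symbian code; `RawBytes` is a hypothetical helper name). It copies out the bytes that back a UTF-16 string without converting anything, which is the same view of memory that the DES_AS_8_BIT cast gives you:

```cpp
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Returns the bytes that back a UTF-16 string, unchanged and in host byte order.
// Nothing is converted: every 16-bit element simply contributes two bytes.
std::vector<std::uint8_t> RawBytes(const std::u16string& text)
{
    std::vector<std::uint8_t> bytes(text.size() * sizeof(char16_t));
    if (!bytes.empty())
    {
        std::memcpy(bytes.data(), text.data(), bytes.size());
    }
    return bytes;
}
```

For u"HELLO" this gives 10 bytes. On a little-endian machine they are 'H', 0, 'E', 0 and so on, and an editor that assumes ASCII displays each zero byte as a gap.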

Many text editors will read that file as is.  Some, however, will require you to also write the Unicode BOM at the start of the file to indicate its format (that it is 16-bit Unicode, and its endianness).

If what I have shown you there doesn't work for you, I'll show you how to write the BOM as well.
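
For reference, the BOM is the character U+FEFF written as the first thing in the file; in little-endian UTF-16 that is the two bytes 0xFF 0xFE. The following is only a sketch in standard C++ (not Symbian code; `ToUtf16LeWithBom` is a hypothetical name) of what a BOM-prefixed file should contain:

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Serialises UTF-16 code units as little-endian bytes, preceded by the
// UTF-16LE byte order mark (0xFF 0xFE).
std::vector<std::uint8_t> ToUtf16LeWithBom(const std::u16string& text)
{
    std::vector<std::uint8_t> out;
    out.reserve(2 + text.size() * 2);
    out.push_back(0xFF); // BOM, low byte of U+FEFF
    out.push_back(0xFE); // BOM, high byte of U+FEFF
    for (char16_t unit : text)
    {
        out.push_back(static_cast<std::uint8_t>(unit & 0xFF));        // low byte first
        out.push_back(static_cast<std::uint8_t>((unit >> 8) & 0xFF)); // then high byte
    }
    return out;
}
```

For u"HI" the result is FF FE 48 00 49 00. On Symbian the same effect can presumably be had by passing those two BOM bytes to RFile::Write before writing the text itself, assuming the text is stored little-endian in memory.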

didster

Fri, 2005-04-29 15:21
Forum Nokia Champion
Joined: 2003-10-01
Forum posts: 723
Quote from: didster
Code:
RFile aAlreadyOpenedFile;
TBuf16<20> aMyFileData = ....

TBuf8<20> aMyFileDataAs8Bit;
aMyFileDataAs8Bit.Copy(aMyFileData);

aAlreadyOpenedFile.Write(aMyFileDataAs8Bit);

Note that your 8-bit buffer must be twice as big as the Unicode one. You know, 20 Unicode characters take up 40 ASCII character slots. That is, aMyFileDataAs8Bit must be TBuf8<40>.

tOtE


Fri, 2005-04-29 15:26
Joined: 2004-07-28
Forum posts: 1379
Quote from: tote
Quote from: didster
Code:
RFile aAlreadyOpenedFile;
TBuf16<20> aMyFileData = ....

TBuf8<20> aMyFileDataAs8Bit;
aMyFileDataAs8Bit.Copy(aMyFileData);

aAlreadyOpenedFile.Write(aMyFileDataAs8Bit);

Note that your 8-bit buffer must be twice as big as the Unicode one. You know, 20 Unicode characters take up 40 ASCII character slots. That is, aMyFileDataAs8Bit must be TBuf8<40>.

tOtE

Eh?  You sure...

20 Unicode characters is 20 ASCII characters.  The only difference is that Unicode ones take up twice as much space per character, but that's OK, since each "slot" in a TBuf16 is twice the size of each slot in a TBuf8.

If TBuf16 and TBuf8 were normal C arrays, the difference would be that sizeof(TBuf16) is twice sizeof(TBuf8).

So, in Windows speak,

WCHAR szUniBuffer[20];

and

CHAR szAsciiBuffer[20];

can store the same number of characters, but not the same number of bytes, i.e. sizeof(szAsciiBuffer) = 20, sizeof(szUniBuffer) = 40.

Descriptors hide all that rubbish from you, and the template parameter is the size in characters, i.e. the same in both cases.
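
The element count versus byte count distinction can be checked with a small sketch in standard C++ (plain arrays standing in for TBuf8 and TBuf16; `ElementCount` and `ByteCount` are hypothetical helper names):

```cpp
#include <cstddef>

// Number of elements in a fixed-size array, whatever the element width.
template <typename T, std::size_t N>
constexpr std::size_t ElementCount(const T (&)[N])
{
    return N;
}

// Storage taken by the same array, in bytes.
template <typename T, std::size_t N>
constexpr std::size_t ByteCount(const T (&)[N])
{
    return N * sizeof(T);
}
```

Given `char narrow[20]` and `char16_t wide[20]`, both report 20 elements, while on mainstream platforms (where char16_t is two bytes) the wide array occupies 40 bytes against 20 for the narrow one.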


didster

Fri, 2005-04-29 15:46
Joined: 2005-04-07
Forum posts: 21
OK, I am getting it. DES_AS_8_BIT works perfectly.

However, the example with

aMyFileDataAs8Bit.Copy(aMyFileData);

does not work for me. I think that is because Copy() converts the TDes16 into a TDes8 character by character, rather than copying it byte by byte.
Fri, 2005-04-29 18:08
Joined: 2004-07-28
Forum posts: 1379
Depends what's in the 16-bit descriptor.

If everything is representable in ASCII, it should work fine.

If you have characters in there which don't exist within the ASCII code page (i.e. Unicode characters), of course it won't work. Not really surprising!  That's why someone invented Unicode.

There are posts on here about "better" ways to convert, that is, ones that don't just chuck away Unicode characters that have no ASCII equivalent.  But if your string does contain Unicode characters, the only way to really get what's in the buffer into the file is to write it in Unicode.

didster

Fri, 2005-04-29 22:04
Forum Nokia Champion
Joined: 2003-10-01
Forum posts: 723
Quote from: didster
Quote from: tote
Quote from: didster
Code:
RFile aAlreadyOpenedFile;
TBuf16<20> aMyFileData = ....

TBuf8<20> aMyFileDataAs8Bit;
aMyFileDataAs8Bit.Copy(aMyFileData);

aAlreadyOpenedFile.Write(aMyFileDataAs8Bit);

Note that your 8-bit buffer must be twice as big as the Unicode one. You know, 20 Unicode characters take up 40 ASCII character slots. That is, aMyFileDataAs8Bit must be TBuf8<40>.

tOtE

Eh?  You sure...

20 Unicode characters is 20 ASCII characters.  The only difference is that Unicode ones take up twice as much space per character, but that's OK, since each "slot" in a TBuf16 is twice the size of each slot in a TBuf8.

You're right if we're talking about characters, not single and double bytes. The original question did not mention characters, only that TFileText can handle 16-bit data as opposed to RFile::Write, which is capable of handling 8-bit data only.

Quote from: didster
If TBuf16 and TBuf8 were normal C arrays, the difference would be that sizeof(TBuf16) is twice sizeof(TBuf8).

Right.

Quote from: didster
Descriptors hide all that rubbish from you, and the template parameter is the size in characters, i.e. the same in both cases.


Sorry, but I have to disagree. Descriptors are often used not for string manipulation but for handling binary data. In the current case, I'm not really sure what we're using the descriptors for.

I've just had a look at Symbian's online help about TBuf16 and it says:

"This is a descriptor class which provides a buffer of fixed length for containing, accessing and manipulating TUint16 data."

Not characters, sorry. If those bytes read by TFileText are binary data, then TDes8::Copy will surely fail, because

"Each double-byte value can only be copied into the corresponding single byte when the double-byte value is less than decimal 256. A double-byte value of 256 or greater cannot be copied and the corresponding single byte is set to a value of decimal 1."

tOtE


Fri, 2005-04-29 22:50
Joined: 2004-07-28
Forum posts: 1379
I made the assumption we were talking about characters here because Unicode is a concept which only applies to textual data.  16-bit descriptors are almost always used for character data.

Sure, descriptors (8-bit) are also used to work with raw binary data, which is one of the good (some people say) things about them.

Characters may be the wrong word; elements is maybe better.

Anyway, I never said .Copy would accurately copy the string, only that it would never overflow the destination descriptor, i.e. this statement is not true:

"Note that your 8-bit buffer must be twice as big as the Unicode one. You know, 20 Unicode characters take up 40 ASCII character slots. That is, aMyFileDataAs8Bit must be TBuf8<40>."

I can see what you're saying (I think).  I think you're saying that .Copy actually does a memory copy of one descriptor to the other....  It doesn't...  If it did, yes, that statement would be correct, i.e. if you did:

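Code:
TBuf16<20> a = ....
TBuf8<XXX> b;

// Raw memory copy: destination first, then source, then the length in bytes.
// a.Size() is the length of a in bytes, i.e. two per element.
memcpy(const_cast<TUint8*>(b.Ptr()), a.Ptr(), a.Size());
b.SetLength(a.Size());

then XXX would need to be 40.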

Copy actually works by walking through the Unicode string one 16-bit element at a time.  For each element, one byte is written into the destination descriptor and the other byte is dropped.  Since only one byte per element is ever stored, the destination buffer clearly does not need to be twice the size of the source.

As I said in my last post, and as you said, .Copy will only accurately copy the string (or whatever) if every "character" in the string is < decimal 256. Otherwise, as you say, it will just write a 1.
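
As an illustration only, here is a sketch in standard C++ that mimics the rule quoted from the SDK help (values below 256 are copied, anything else becomes 1). It is not Symbian's implementation, and `NarrowCopy` is a hypothetical name:

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Narrows 16-bit elements to bytes, one output byte per input element.
// Values below 256 are kept; anything larger is replaced by 1.
std::vector<std::uint8_t> NarrowCopy(const std::u16string& source)
{
    std::vector<std::uint8_t> result;
    result.reserve(source.size());
    for (char16_t unit : source)
    {
        result.push_back(unit < 256 ? static_cast<std::uint8_t>(unit)
                                    : static_cast<std::uint8_t>(1));
    }
    return result;
}
```

The output always has exactly as many elements as the input, which is why a TBuf8<20> is big enough for a TBuf16<20>. A Hindi letter such as U+092A comes out as the single byte 1, so the original text cannot be recovered.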

didster

Fri, 2005-12-16 07:19
Joined: 2005-11-30
Forum posts: 11
Reading unicode text file and showing unicode font

I've made a Unicode text file containing the Hindi text प्रियंका (saved with ENCODING = UNICODE).
I am reading that file with this code:

Code:
RFs fs;
User::LeaveIfError(fs.Connect());
_LIT(KStreamStoreName, "C:\\Unicode1.txt");
CleanupClosePushL(fs);
RFile file;
User::LeaveIfError(file.Open(fs, KStreamStoreName, EFileRead | EFileStreamText));
CleanupClosePushL(file);
TBuf<64> buf16;
TFileText aTxtFile;
aTxtFile.Set(file);
// Read the first line of the file as 16-bit text
if (aTxtFile.Read(buf16) != KErrEof)
    {
    const TDesC& aText16 = buf16;
    iAppContainer->SetTextL(aText16);
    }
CleanupStack::PopAndDestroy(2); // file, fs

I have to read that text and process it character by character, but it shows up as square boxes on the S60 2nd Edition FP2 emulator.
I know Unicode is shown as square boxes when it is viewed in a non-Unicode environment.
Maybe the emulator doesn't support Unicode. I have also read the same Unicode text file on a Nokia 6600, and the same square boxes appear there too.
Please let me know in detail what I should do and how.
It is very urgent.
Tue, 2008-01-22 15:36
Joined: 2007-09-27
Forum posts: 3

Hi all,

I had similar problems with German letters on Symbian OS 9.1.
I solved them by using the class CnvUtfConverter.

Functions defined in CnvUtfConverter:
ConvertFromUnicodeToUtf7(), ConvertFromUnicodeToUtf7L(), ConvertFromUnicodeToUtf8(), ConvertFromUnicodeToUtf8L(), ConvertToUnicodeFromUtf7(), ConvertToUnicodeFromUtf7L(), ConvertToUnicodeFromUtf8(), ConvertToUnicodeFromUtf8L()

Header to #include: UTF.H
Link against: charconv.lib (add it to the libraries in the .mmp file)

For additional information, please refer to the SDK help: » Symbian OS v9.1 » Symbian OS reference » C++ component reference » Syslibs CHARCONV_ONGOING » CnvUtfConverter
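
To show what a Unicode to UTF-8 conversion produces for such letters, here is a hand-written sketch in standard C++. It is not CnvUtfConverter and does not use its API; `Utf16ToUtf8` is a hypothetical name, and replacing unpaired surrogates with U+FFFD is a choice made for this example:

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Encodes UTF-16 text as UTF-8. Unpaired surrogates become U+FFFD.
std::string Utf16ToUtf8(const std::u16string& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i)
    {
        std::uint32_t cp = in[i];
        const bool highSurrogate = cp >= 0xD800 && cp <= 0xDBFF;
        if (highSurrogate && i + 1 < in.size()
            && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF)
        {
            // Combine a surrogate pair into one code point above U+FFFF.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;
        }
        else if (cp >= 0xD800 && cp <= 0xDFFF)
        {
            cp = 0xFFFD; // lone surrogate
        }

        if (cp < 0x80)
        {
            out += static_cast<char>(cp);                         // 1 byte
        }
        else if (cp < 0x800)
        {
            out += static_cast<char>(0xC0 | (cp >> 6));           // 2 bytes
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
        else if (cp < 0x10000)
        {
            out += static_cast<char>(0xE0 | (cp >> 12));          // 3 bytes
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
        else
        {
            out += static_cast<char>(0xF0 | (cp >> 18));          // 4 bytes
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

With this encoding the German letter ä (U+00E4) becomes the two bytes C3 A4, while plain ASCII text is unchanged, so the result can be written to a file as ordinary 8-bit data without losing characters.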

Cheers,
Broqua
