New Contributor
martij
Posts: 4
Registered: ‎04-03-2007

C8051F120 + CP2200

We've built a couple of boards with C8051F120s and CP2200s and got them to work with most of the TCP/IP examples and even some of our own. The programming was simple and straightforward. However, the transmit speed from the 8051 to a PC is abysmal, even when running the provided examples on the C8051FX20-TB + AB4 development kit. Beefing up the FTP example by adding 10,000 X's to the HTML file it transfers (FTP service code, TCP-IP Config version 3.21) results in a transfer rate of 2.4 kbytes/second (about 4.5 seconds to transfer 10678 bytes). We've tried a number of things:

1) Beef up the transmit buffer size. Sending 1k blocks gets the rate up to about 4.5k/second.
2) Direct CP2200 -> PC connection with a crossover cable to avoid routers, switches and other traffic - no effect.
3) Using FULL_DUPLEX instead of letting the system arbitrate - no effect.
4) 3 different board designs including the AB4 development board and two of our own - no effect.
5) Built our own servers and firmware -> speed up to 4.5k by using bigger buffers. Then we wrote a PC program to simulate the transfer - 700k b/s transfer rate.

The problem is definitely on the 120/2200 side. We're ready to dump the CP2200 and switch to the serial port which can transfer at a rate of 10k bytes per second. This is pretty slow for dumping 16MB of data collected at high speed, but I'd rather use the 8K available for compression instead of the TCP stack.

What is going wrong?



------------------
Jed Marti
Frequent Contributor
apemberton
Posts: 125
Registered: ‎12-02-2004

Re: C8051F120 + CP2200

Whilst I cannot comment on the particular implementation (I'm using neither the CMX stack nor a file transfer application), a couple of things come to mind.

The CP220x is a 10baseT product whilst it is likely that your PC is using a 100Mbit/sec part, and thus collisions on a network are less likely (OK the back to back setup obviously shouldn't have many collisions).

Checksum generation is the most intensive part of processing a packet. I do not know whether the CMX stack computes it in compiled C or in assembler. I certainly spent some time converting the C code for checksum calculation to a much smaller and faster assembler routine.

The whole setup is still not fast however!

Horses for courses.



------------------
Tony Pemberton
New Contributor
martij
Posts: 4
Registered: ‎04-03-2007

Re: C8051F120 + CP2200

This is a C8051F120 which is mighty fast; I find it hard to believe that it's spending 250 milliseconds doing a checksum. I'm suspicious about the TCP/IP stack. My current belief, perhaps wrong, is that it is waiting for an acknowledge packet before sending the next. On systems with more memory there's usually a list of sent messages that might have to be resent. Since we only have room for a single message, we have to wait for an acknowledge before sending the next. The speed up solution is to write your own TCP/IP using UDP and play games with the response mechanism.
Jed Marti
Community Moderator
FarrisB
Posts: 43
Registered: ‎01-08-2003

Re: C8051F120 + CP2200

Hello Martij,

The TCP/IP stack will send one TCP packet, then wait for the ACK before sending another one. Most PCs implement delayed acknowledgment, an algorithm for reducing network traffic, which allows them to hold the acknowledgment for up to 200 ms. This means you can only send 5 TCP packets per second.
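A rough check of that limit, as a sketch only: with one segment in flight and an ACK held for up to 200 ms, throughput is bounded by the segment payload divided by the ACK delay. The function name and the payload sizes below are illustrative assumptions, not values taken from the CMX stack.

```python
def stop_and_wait_throughput(payload_bytes, ack_delay_s=0.2, rtt_s=0.0):
    """Bytes per second when each segment waits for its ACK before the next is sent."""
    return payload_bytes / (ack_delay_s + rtt_s)

# Measured figure from the first post: 10678 bytes in about 4.5 seconds.
measured = 10678 / 4.5

for payload in (512, 1024, 1460):   # hypothetical payload sizes
    rate = stop_and_wait_throughput(payload)
    print(payload, "bytes/segment ->", round(rate), "bytes/s")
print("measured:", round(measured), "bytes/s")
```

At 5 segments per second, segments of roughly 500 bytes give about 2.5 kbytes/s and 1 KB blocks give about 5 kbytes/s, which is in the same range as the 2.4k and 4.5k figures reported earlier in the thread.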

To overcome this, you can make a bi-directional TCP application (i.e. have the PC send back a packet each time it receives a packet). This will surely increase your TCP data rate.
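On the PC side, the idea could look something like the sketch below. It assumes the firmware is changed to read and discard a short reply after each block; the function name, the one-byte reply and the chunk size are all made up for illustration.

```python
import socket

def receive_with_replies(conn, total_bytes, reply=b"\x06", chunk=2048):
    """Read total_bytes from conn, sending a short reply after every read.

    The reply gives the PC's TCP layer outgoing data that the ACK can
    ride on, instead of the ACK sitting behind the delayed-ACK timer.
    """
    data = bytearray()
    while len(data) < total_bytes:
        block = conn.recv(chunk)
        if not block:           # peer closed the connection early
            break
        data.extend(block)
        conn.sendall(reply)     # application-level reply after each block
    return bytes(data)
```

Here `conn` is an already connected `socket.socket`, for example the one returned by `accept()` on the PC.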

Alternatively, you can use UDP, which does not have this limitation. With UDP, you can achieve upwards of 2-3 Mbps.

Farris
New Contributor
martij
Posts: 4
Registered: ‎04-03-2007

Re: C8051F120 + CP2200

We tried that and it definitely helped, but not a whole lot. We're getting up to 157 kilobytes/second now, which is acceptable but not stunning. Doing what's in AN 292 on the PC side was a good step, but it appears to cause other problems with short packets that we don't understand and don't have time to track down.

The UDP route is the best solution but then you're merely reimplementing the TCP stack - something's not right with somebody's code.
Jed Marti
Frequent Contributor
KevinH
Posts: 87
Registered: ‎07-21-2005

Re: C8051F120 + CP2200

There definitely could be room for improvement with the stack. Waiting for an acknowledge to every packet defeats the implementation of the TCP delayed ACK, which occurs when no packet is received for a certain time, usually 200ms by default. However, if two packets were to be sent in succession, an ACK would occur immediately in response to the second packet. I would think that waiting for an ACK after every other packet would significantly increase throughput.

Here's an excellent description of the problem:

http://www.stuartcheshire.org/papers/NagleDelayedAck/
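A toy model of the difference between waiting after every packet and waiting after every other packet, as a sketch. It assumes a receiver that ACKs at once when a second unacknowledged segment arrives and otherwise waits for the delayed-ACK timer; the 1 ms round trip is an assumed value, and serialization and MCU processing time are ignored.

```python
def transfer_time(n_segments, in_flight, rtt_s=0.001, delayed_ack_s=0.2):
    """Estimate seconds to send n_segments when the sender waits for an ACK
    after every `in_flight` segments (1 or 2)."""
    if in_flight not in (1, 2):
        raise ValueError("model covers 1 or 2 segments in flight")
    if in_flight == 1:
        # Every lone segment waits out the delayed-ACK timer.
        return n_segments * (rtt_s + delayed_ack_s)
    # Pairs are ACKed immediately; an odd final segment waits for the timer.
    pairs, leftover = divmod(n_segments, 2)
    return pairs * rtt_s + leftover * (rtt_s + delayed_ack_s)

for window in (1, 2):
    print(window, "in flight:", transfer_time(8, window), "s for 8 segments")
```

Under these assumptions eight segments take about 1.6 s one at a time and only a few milliseconds in pairs, so the delayed-ACK timer dominates everything else.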
Esteemed Contributor
Tsuneo
Posts: 6,303
Registered: ‎02-15-2004

Re: C8051F120 + CP2200

Isn't this a problem with the chip's hardware design?

The RX buffer on the chip is a 4 KB FIFO, but the TX side is a single 2 KB buffer, not a FIFO. If the TX buffer were split into a FIFO of at least two slots, a packet could be held there until the chip receives the ACK, without taking up any MCU RAM.

For an Ethernet chip, the choice of a single TX buffer may be usual.
But when a TCP/IP implementation is considered, the SiLabs designer failed to catch the requirements of an 8-bit MCU, which is the target of this chip.

Maybe (s)he doesn't know how many bytes of RAM their MCUs have.
Or (s)he knows just the Ethernet protocol, but nothing of TCP/IP.

Tsuneo

[This message has been edited by Tsuneo (edited June 28, 2007).]
Frequent Contributor
KevinH
Posts: 87
Registered: ‎07-21-2005

Re: C8051F120 + CP2200

It seems like the problem would still be with the stack, because even if you use a 512-byte buffer, which could place multiple packets in the TX buffer, the problem still exists. I was originally using your DHTML2 example, and got curious as to why the initial form was slow to display. That's when I took a look with the packet sniffer and saw that the system is waiting for an ACK after every packet. I would think it's the stack that is doing this by how it controls the CP2200 and not the HW itself.

I changed the buffer sizes to the maximum TCP size and there was a major performance improvement, but it still was waiting for an ACK after each packet. I guess this might require an in-depth examination of both the CP2200 and the stack...
Esteemed Contributor
Tsuneo
Posts: 6,303
Registered: ‎02-15-2004

Re: C8051F120 + CP2200

The reason the stack waits for the ACK is simple.
It cannot discard the packet in the TX buffer until it receives the ACK for that packet, because a retransmission may be needed. To handle the next packet while waiting, another buffer is required, which means the TX buffer size is doubled. If the chip held the last packet, the stack could handle the next packet without doubling the number of TX buffers.

To handle multiple TX packets at a time, a corresponding amount of buffer memory is required somewhere in the system. So where is the best place for the buffer? On-chip. If the buffer is placed on the MCU side, the data have to be reloaded into the Ethernet chip for every retransmission.
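The buffer cost described above can be sketched as a small retransmit window. This is only an illustration of the bookkeeping, not how the CMX stack or the CP2200 is organized; the class and method names are invented, and sequence-number wraparound is ignored.

```python
from collections import OrderedDict

class RetransmitWindow:
    """Holds copies of sent-but-unacknowledged segments, up to `slots` of them."""

    def __init__(self, slots, max_segment_bytes):
        self.slots = slots
        self.max_segment_bytes = max_segment_bytes
        self.pending = OrderedDict()   # sequence number -> payload

    def ram_needed(self):
        """Worst-case buffer memory; it grows linearly with the slot count."""
        return self.slots * self.max_segment_bytes

    def can_send(self):
        return len(self.pending) < self.slots

    def sent(self, seq, payload):
        if not self.can_send():
            raise RuntimeError("window full: must wait for an ACK")
        self.pending[seq] = payload    # keep a copy in case of retransmission

    def acked(self, ack):
        """Cumulative ACK: drop every segment that ends at or before `ack`."""
        for seq in list(self.pending):
            if seq + len(self.pending[seq]) <= ack:
                del self.pending[seq]
```

With `slots=1` the sender is forced into the send-then-wait pattern seen in the packet traces; each extra slot costs one more full-size segment buffer somewhere in the system.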

Tsuneo

[This message has been edited by Tsuneo (edited June 28, 2007).]
New Contributor
martij
Posts: 4
Registered: ‎04-03-2007

Re: C8051F120 + CP2200

We're using them for some temporary data collection but probably would think twice about using them again. By the time you add sufficient RAM to get the TCP stack going, you've complicated the beast considerably - size and power requirements are reaching that of 16 bit systems.

The PoE stuff requires a special router (expensive) and turned into a logistical mess - you're never sure if your Ethernet outlet has power or not - if you have to ask, it usually doesn't.

In short, we haven't come up with a killer application that can benefit from this chipset.
Jed Marti
Esteemed Contributor
Tsuneo
Posts: 6,303
Registered: ‎02-15-2004

Re: C8051F120 + CP2200

Another drawback of the CP220x is that its I/O timing is too slow for the full speed of the 'F12x-13x. When SYSCLK is set to 100 MHz, the I/O timing that the EMIF can generate doesn't meet the CP220x requirements.

Read pulse width: 160 ns is required, and 160 ns is the max - no margin
Write hold time : 40 ns is required, but 30 ns is the max

75 MHz is the maximum SYSCLK that satisfies the write hold time.

Code:

Table 26. Non-Multiplexed Intel Mode AC Parameters (CP2200.pdf rev0.41 p97)
TRD RD Low Pulse Width (Read): 160 ns (min)
TDH Data Hold Time (Write) : 40 ns (min)

Table 17.1. AC Parameters for External Memory Interface (C8051F12x-13x.pdf rev1.4 p233)
TACW Address/Control Pulse Width: 16 x TSYSCLK (max)
TWDH Write Data Hold Time : 3 x TSYSCLK (max)
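The arithmetic behind those numbers, as a sketch. It uses the figures exactly as quoted in this post and does not re-check them against either datasheet; the function and constant names are made up for illustration.

```python
def emif_max_ns(cycles, sysclk_mhz):
    """Longest time the EMIF can produce for a parameter set in SYSCLK cycles."""
    return cycles * 1000 / sysclk_mhz

def max_sysclk_mhz(cycles, required_ns):
    """Highest SYSCLK at which `cycles` clock periods still cover `required_ns`."""
    return cycles * 1000 / required_ns

# Figures as quoted in the post (assumed correct for this sketch).
TACW_MAX_CYCLES = 16   # EMIF address/control pulse width, in SYSCLK cycles
TWDH_MAX_CYCLES = 3    # EMIF write data hold time, in SYSCLK cycles
TRD_MIN_NS = 160       # CP2200 read pulse width requirement
TDH_MIN_NS = 40        # CP2200 write data hold requirement

for mhz in (100, 75):
    pulse = emif_max_ns(TACW_MAX_CYCLES, mhz)
    hold = emif_max_ns(TWDH_MAX_CYCLES, mhz)
    print(mhz, "MHz: pulse", pulse, "ns (need", TRD_MIN_NS,
          "), hold", hold, "ns (need", TDH_MIN_NS, ")")
print("max SYSCLK for hold time:", max_sysclk_mhz(TWDH_MAX_CYCLES, TDH_MIN_NS), "MHz")
```

If a different hold-time requirement applies to the bus mode in use, changing `TDH_MIN_NS` shows the corresponding SYSCLK ceiling.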


Tsuneo

[This message has been edited by Tsuneo (edited June 29, 2007).]
Frequent Contributor
KevinH
Posts: 87
Registered: ‎07-21-2005

Re: C8051F120 + CP2200

Tsuneo,

The write data hold time you referenced is for multiplexed mode. The non-multiplexed mode timing is 20ns.

It had better work at 98 MHz - all of the examples that SiLabs ships for the 'F120 run at this speed! But you make a good point - the non-multiplexed chip would be required to be able to run at the 'F120's maximum speed.

[This message has been edited by KevinH (edited June 29, 2007).]