
Re: LTspice Genealogy - The Heritage of Simulation Ubiquity


 

--- In LTspice@..., "analogspiceman" wrote:

Segueing back to the history of LTspice, I just found a very informative
article that appeared in Electronic Design in October of last year. In
the middle, it has a long section in which Mike Engelhardt recounts,
in a level of detail that I haven't seen elsewhere, the technical
history of the development of LTspice.

Here's the pertinent excerpt:

"Free Downloadable Spice Tools Capture And Simulate Analog Circuits"
by Don Tuite, Electronic Design, Oct. 23, 2012 (web edition).

How Engelhardt Made LTspice

Engelhardt is essentially the godfather of Linear Technology's freeware LTspice, and he has been supporting it and making it faster for 15 years. It all started because Bob Swanson and Bob Dobkin asked for a promotional tool. They got rather more than they bargained for because Engelhardt is the kind of engineer who obsesses about doing the job as thoroughly as possible. In this case, he wanted to make LTspice the world's fastest.

He also wanted to give Linear's chip designers the opportunity to make the best models. He points to Linear's current-mode switching supplies as an example. "A current-mode switch-mode power supply is basically a glorified flip-flop," he begins. "Something sets the flop, turns on a power switch, and a current is monitored, and when a condition is met, the thing is reset." As long as the designer gets that flip-flop working correctly, gets the "reset" condition correctly, and has the right transfer function between the compensation of voltage and peak current, "basically everything else is a detail."

This approach led Engelhardt to conclude that behavioral models are more useful than transistor-level models. "When we make analog models, we can avoid a bunch of numerical difficulties. When we get done setting up the system simultaneous equations that need to be solved, we have many fewer non-zero elements," he says. That makes it possible to solve the equations more rapidly.

Sometimes, this comes down to a black-box approach. Returning to that switching-supply example, Engelhardt says, "The main flip-flop of a current-mode switcher is basically about a dozen gates. That's because you have to accommodate a maximum duty cycle and you want to set it with an edge and reset it with an analog condition." A fine-granularity model would implement all of those gates explicitly. Instead of modeling that, Linear has a native circuit element that does the same job, and its behavioral model simulates faster.
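Engelhardt's description of the main flip-flop can be sketched behaviorally in a few lines. This is only an illustrative model, not an LTspice element; the class, method, and parameter names are all invented here. The latch is set by a clock edge and reset by an analog condition: the sensed current reaching the comp-pin threshold, or the maximum duty cycle being hit.

```python
class CurrentModePWMLatch:
    """Toy behavioral model of the main flip-flop in a current-mode
    switcher: set with an edge, reset with an analog condition.
    Names and defaults are invented for illustration."""

    def __init__(self, max_duty=0.9):
        self.max_duty = max_duty  # maximum on-time as a fraction of the period
        self.on = False           # state of the power switch

    def step(self, clock_edge, i_sense, i_threshold, t_frac):
        """Advance one simulation step.

        clock_edge  -- True on the rising edge of the oscillator
        i_sense     -- sensed switch current
        i_threshold -- peak-current threshold from the comp pin
        t_frac      -- fraction of the switching period elapsed
        """
        if clock_edge:
            self.on = True        # set with an edge
        if self.on and (i_sense >= i_threshold or t_frac >= self.max_duty):
            self.on = False       # reset with an analog condition
        return self.on
```

One state variable and two comparisons replace the dozen-gate implementation, which is exactly the kind of reduction in equation count the quote is describing.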
"World's Fastest Spice"

Engelhardt says his approach to accelerating Spice evolved in an interesting fashion, and it required breaking some rules. "In the '70s when you were writing a numerical solver, and you were being disciplined about it, you would always keep firmly in mind that the fastest way to complete something was to not compute it all but to put it in a lookup table," he says, but that wasn't for him.

"That's the worst thing you can do. The computer CPU is hundreds of times faster than memory. Every time you use a lookup table, you have to fire those transistors in the RAM. You have to keep everything in cache," he says.
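The compute-versus-lookup trade-off can be made concrete with a toy diode-equation evaluator. The parameter values and the 1 mV table grid below are invented for illustration: the direct form is a handful of FPU instructions with no memory traffic, while the table form walks an array in RAM on every call, which is the cache cost Engelhardt is arguing against.

```python
import math

IS, VT = 1e-14, 0.02585  # assumed diode saturation current and thermal voltage

def i_diode(v):
    # Direct evaluation: one exp, one multiply -- stays in registers.
    return IS * (math.exp(v / VT) - 1.0)

# Lookup-table alternative: precompute once, interpolate at run time.
# Valid for 0 <= v <= 0.8 V on a 1 mV grid; every call touches RAM.
STEP = 0.001
TABLE = [i_diode(i * STEP) for i in range(801)]

def i_diode_lut(v):
    k = min(int(v / STEP), len(TABLE) - 2)
    frac = v / STEP - k
    return TABLE[k] + frac * (TABLE[k + 1] - TABLE[k])
```

On a 1970s machine the table wins; on a cache-starved modern CPU the direct evaluation often does, even though it "computes more."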

Surprisingly for someone who is talking about designing a Spice to run on PCs, Engelhardt turned to Seymour Cray for inspiration. Engelhardt says that matrix solving is a classic application of parallel processing.

"The Cray was about being able to take a row of a matrix, multiply it all by one number, and subtract it from a number of other rows," he says. Initially, in fact, Engelhardt had written a multi-threaded matrix solver and found it hundreds of times slower than a single-threaded matrix solver. So, he analyzed the actual time required to execute all of the instructions.

He found that the timing of the different threads was, in a sense, chaotic. The overhead of coordinating the threads made it impossible to get any speedup from using more than one, so a single thread turned out to be faster. That insight sent him to the literature on multi-threaded sparse-matrix solvers, which confirmed what he was finding out.

"You can implement a multi-threaded solver that runs faster than a single-threaded solver if the matrix is not this sparse. But in the circuit matrix for a big IC, only a few parts out of a thousand are non-zero," he says. One problem was selling that concept in an engineering environment that expects a multi-threaded solver, so he told people that it was a multi-threaded simulator, but that the matrix solving itself was single-threaded.
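The "few parts out of a thousand" figure is easy to reproduce. In a nodal formulation, each node couples only to the handful of nodes it shares a component with, so the density of non-zeros falls roughly as 1/N. A toy resistor-ladder example (function names invented for illustration):

```python
def nodal_matrix(n, g=1.0):
    """Nodal conductance matrix of an n-node resistor ladder:
    each node ties to its two neighbours through conductance g
    and to ground.  The result is tridiagonal."""
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        a[i][i] = 2.0 * g + 1.0      # neighbour conductances + ground leak
        if i > 0:
            a[i][i - 1] = -g
        if i + 1 < n:
            a[i][i + 1] = -g
    return a

def density(a):
    """Fraction of matrix entries that are non-zero."""
    n = len(a)
    nnz = sum(1 for row in a for x in row if x != 0.0)
    return nnz / (n * n)
```

For a 1,000-node ladder the matrix has 3n - 2 = 2,998 non-zeros out of a million entries, a density of about 0.003: exactly the "few parts out of a thousand" regime in which, per Engelhardt, multi-threading the solve stops paying off.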

That wasn't the end, though. He then thought he could write a faster matrix solver. Engelhardt's next discovery was related to the number of clock cycles it takes to perform a floating-point operation. Benchmarking the kind of 3-GHz processors that were in common use, he figured each clock was 300 ps, and three of them were required to execute one floating-point operation.

"After the operation has been completed, if there's a long pipeline, you've got to wait to give the result to the processor at the end of the pipeline," he says. By Engelhardt's arithmetic, the processor has to sit there and twiddle its thumbs for 900 ps before it can multiply two floating-point numbers to yield a 64-bit accurate result. He reckons that if you're counting floating-point operations, this makes a multithreaded architecture run at probably something like only 10% of its theoretical speed. So, he found a way to fix that: another technique based on the wise use of memory. The only important thing, he says, is how well you use the cache.

"When you're coding in a high-level language, much of the time when we refer to 'X,' it's not X, it's the address of X. And the programming language doesn't know where X is stored. So you have to ping the transistors on the motherboard a few times to get the actual bit pattern that's your double-precision number. And that's unavoidable," he explains.

But here's the trick: "After you know where your data is, once you've looked at all your addresses, at that point you could call an assembly language program that would access the data itself, keeping the pipeline full. And that's what LTspice does."

Effectively, after it loads all the data, LTspice finds a strategy for pivoting the matrix and solving it. Then it generates the corresponding assembly language program, assembles it, links it, and just calls each address to solve the matrix. "And that sped up the process by a factor of three," he says.
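The generate-then-call idea can be illustrated, loosely, in Python. Once the pivot order is fixed, emit straight-line source with every index hard-coded, compile it once, and call the resulting function repeatedly: no loop overhead, no index arithmetic, every address known in advance. LTspice emits machine code; this sketch emits Python, and the function names are invented here.

```python
def generate_unrolled_solver(n):
    """Emit and compile straight-line Gaussian elimination for an
    n-by-n system with the pivot order fixed in advance (no partial
    pivoting).  A loose analogy to LTspice's generated solver."""
    lines = ["def solve(a, b):",
             "    a = [row[:] for row in a]; b = b[:]"]
    # Forward elimination, fully unrolled with hard-coded indices.
    for k in range(n):
        for i in range(k + 1, n):
            lines.append(f"    m = a[{i}][{k}] / a[{k}][{k}]")
            for j in range(k, n):
                lines.append(f"    a[{i}][{j}] -= m * a[{k}][{j}]")
            lines.append(f"    b[{i}] -= m * b[{k}]")
    # Back substitution, also unrolled.
    lines.append(f"    x = [0.0] * {n}")
    for i in range(n - 1, -1, -1):
        rhs = f"b[{i}]" + "".join(f" - a[{i}][{j}] * x[{j}]"
                                  for j in range(i + 1, n))
        lines.append(f"    x[{i}] = ({rhs}) / a[{i}][{i}]")
    lines.append("    return x")
    ns = {}
    exec("\n".join(lines), ns)  # compile the generated source once
    return ns["solve"]
```

The generated function is built once per matrix structure and then called for every timestep, which is where the amortization comes from.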

When experienced circuit designers began to download LTspice and use it, they immediately noticed the difference from PSpice: it was noticeably faster. However, Engelhardt also notes that LTspice is for circuit design. It's not intended to compete with an IC design tool such as Synopsys' HSPICE. On the other hand, it's free, versus $1,500.

Of course, the implementation of the modeling is only part of the story. The other part is the models. Engelhardt says that modeling power MOSFETs is tricky. "Power MOSFETs are hard to simulate in most Spice programs because a power MOSFET is a vertical structure. It has a drain on the back of the die. In contrast, in modeling an IC, everything is lateral and on the same side of the silicon," he says.

"We made a vertical MOSFET model where the gate and drain can be different sizes, and the channel is in between them, with the capacitance between the gate and the drain being dependent on whether the channel is enhancement or depletion. That charge model doesn't exist in other Spice programs," he says.

He says it's important to have that capability. The need for it became obvious when it became clear that with other Spice programs, "Switching waveforms from power MOSFETs didn't match what you see on the bench, and that's because this gate-drain capacitance introduces a Miller effect that dominates the switching characteristics," he says.
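A toy version of the voltage-dependent gate-drain capacitance Engelhardt describes might look like the following. All names and values are invented here, and this is not LTspice's actual charge model: the idea is simply that Cgd is large when the surface under the gate-drain overlap is enhanced, collapses toward a small depletion value as the drain rises above the gate, and crosses over smoothly, which is what produces the Miller plateau in switching waveforms.

```python
import math

def cgd_vertical_fet(v_gd, c_ox=1e-9, c_dep_min=5e-11, v_t=0.0, spread=0.5):
    """Toy gate-drain capacitance of a vertical power MOSFET.

    v_gd      -- gate-to-drain voltage
    c_ox      -- capacitance with the overlap region enhanced (farads)
    c_dep_min -- residual capacitance in deep depletion (farads)
    v_t, spread -- crossover centre and width of the smooth transition
    All parameters are illustrative assumptions, not measured values.
    """
    s = 1.0 / (1.0 + math.exp(-(v_gd - v_t) / spread))  # logistic crossover
    return c_dep_min + (c_ox - c_dep_min) * s
```

A fixed-Cgd model misses the plateau entirely, which is consistent with the bench mismatch described above: the Miller feedback through this voltage-dependent capacitance dominates the switching edges.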
