Skip to content
January 24, 2012 / ecschroder

Un-Break My Heart

Happy deployments are all alike; but every unhappy deployment is unhappy in its own way. (That’s how it goes, right?) Way back in the day, once upon a time, I had a happy deployment. But now…

Issue #1

A little while ago, on the advice of a friend, I split settings.py into a global settings files (settings.py) and local settings file (settings_local.py). Global settings can live in the repo, while local settings stay at home. Pros: by splitting the files, I could keep certain settings–like SECRET_KEY–private. Cons: by splitting the files, I screwed up my deployment on DjangoZoom. Oops.

Issue #2

I think I need to revisit my static files set up. I’m still a little fuzzy on the details. When I try to deploy on DjangoZoom, the build fails on the ‘manage.py collectstatic –link –noinput’ command. Here’s a snippet of the error:

ErrorInsideEnvironment: Command [u'/cust/p00000624/bundle_p00000624_2012-01-24-16.10.36/thisbundle_build.py', 'collectstatic', '--link', '--noinput'] returned non-zero exit code 1.
...
OSError: [Errno 2] No such file or directory: '/cust/p00000624/bundle_p00000624_2012-01-24-16.10.36/user-src/remake/media/common'

Next on the reading list:

http://stackoverflow.com/questions/4565935/django-staticfiles-app-help

https://docs.djangoproject.com/en/dev/howto/static-files/

Advertisement
December 12, 2011 / ecschroder

Views, Transactions, Recursion, OLAP

These are some notes from the database class that Stanford offered online & free. Good times…

Views

Consider this three-level vision of databases:

Real world applications have lots of views. Views can be used to:
– hide some data from some users
– make some queries easier / more natural
– modularize database access

Views are important for customizing user authentication. Using views, it’s possible to give a user only to the tuples that meet certain criteria. (Views with aggregation are not possible.)

Modifications to views are a little tricky. Modifications must be rewritten to specify how they will modify the base tables. There are two ways to handle view modifications.

(1) Rewriting process specified explicitly by view creation.
PRO: can handle all modifications
CON: no guarantee of correctness (or meaningfulness)

(2) Restrict views and modifications so that translation to base table modifications are meaningful and unambiguous.
PRO: no user intervention
CON: restrictions are significant

Aggregation views are used for queries only. It doesn’t make sense to modify the aggregation view. What sort of modification would you make to the base tables? For example, to change the average GPA from 3.5 to 4.0, whose GPA would you increase?

Likewise, projections and distinct are keywords that can only be used in view queries, but not modifications.

Instead-of triggers are used to catch modifications to views and translate them into meaningful modifications to the underlying base tables.

Authorization

On a UNIX file system, the basic privileges are read, write and execute. The relational model is more complex than the UNIX file system, so it makes sense that there were be different (and more complex) types of privileges.

Types of privileges

There are nine types of privileges in SQL: SELECT, INSERT, DELETE, UPDATE, REFERENCE, USAGE, TRIGGER, EXECUTE, and UNDER.

SELECT, INSERT, DELETE and UPDATE are the basic privileges that apply to a relation.

REFERENCE is the right to use a relation in an integrity constraint.

EXECUTE is the right to execute code (for example, a persistent stored module).

UNDER is the right to create subtypes.

Obtaining privileges

The creator of a relation is the relation’s owner. The owner has all privileges and may grant privileges to others.

GRANT privilege_name
ON object_name
TO {user_name |PUBLIC |role_name}
[WITH GRANT OPTION];

The grant option allows the user who received privileges to grant the same privileges to others. A user may revoke privileges with keywords CASCADE or RESTRICT.

Transactions

What happens if the system fails in the middle of a set of changes? How should sets of changes be interleaved?

A transaction is a collection of one or more operations on the database that must be executed atomically. In other words, all of them are done or none of them are done.

SQL defines four different isolation levels. From least stringent to most stringent, they are: read uncommitted, read committed, repeatable read, and serializable.

Recursion

SQL is not a Turing complete. In other words, its not possible to solve any computation problem with the data-manipulation rules available in SQL. However, SQL has the benefits of being simple, convenient and declarative. It is expressive enough for most database queries. Basic SQL cannot express unbounded computations.

One type of problem that is tricky in SQL: how to find all of Mary’s ancestors, given the relation ParentOf(parent,child)? If we know how many generations out data encompasses, it’s not problem. (We can simply use a SELECT statement UNION another SELECT statement, rinse and repeat once for each generation.) The trickiness comes into play when we don’t know how many generations are involved.

The SQL With statement is useful for recursive problems. It’s like setting up temporary views.

With R1 as (query-1),

R2 as (query-2),

Rn as (query-n)

<query involving R1, …, Rn (and other tables)>

Using the With statement, we can write recursive queries. Here’s an example:

With Recursive
R as (base query
Union
recursive query)
<query involving R (and other tables)>

Here’s an example of a linear recursive query based on the relation Flights(airline,frm,to,departs,arrives):

WITH RECURSIVE Reaches(frm,to) As
(SELECT frm, to FROM Flights
UNION
(SELECT R1.frm, R2.to
FROM Reaches R1, Reaches R2
WHERE R1.to = R2.frm)
SELECT * FROM Reaches;

This same query, written in datalog, looks like this:

[1] Reaches(x,y) ← Flights(a,x,y,d,r)
[2] Reaches(x,y) ← Reaches(x,z) AND Reaches(z,y)

(In human-speak, rule [1] gives us all of the direct flights, while rule [2] gives us flight combinations that include a layover. It’s a classic recursion problem.)

The first SELECT statement in the above query corresponds to datalog rule [1]. The second SELECT statement (the recursive part) corresponds to datalog rule [2].

Note: Remember, the Union operator in SQL automatically eliminates duplicates. That’s different from how the Union operator works in relational algebra. Relational algebra is based on the set model, while SQL is based on the multiset model.

Linear recursion allows only one reference to R in the recursive statement. If there are multiple references to R in the recursive statement, it is said to be non-linear. Non-linear recursion has some benefits. Namely, queries look cleaner and they converge faster. However, non-linear recursive statements are harder to implement.

Say we have a graph with one linear path through 1000 nodes. If we want to get all nodes using a linear statement, it will add nodes one at a time and require 1000 steps. If we use a non-linear statement, nodes will be added in just log2(1000) = 10 steps.

The SQL standard only requires linear recursion. It is implemented only in PostgreSQL (not MySQL or SQLite). Non-linear recursion is not supported by any of these three DBMS systems. In fact, it’s not even part of the SQL standard.

Hubs and Authorities was a nifty little algorithm developed around the same time as Google page rank for searching web pages.

A unary relation is a relation with only one attribute.

A statement S is a monotone if S adds one or more tuples to a relation R, or leaves R unchanged, but does not delete from R. (Remember, from differential eq., monotonic functions are entirely non-increasing or entirely non-decreasing. The first derivative does not change the sign. Similarly, monotonic sequences are such that ai+1 > a1 for every i > 1 or ai+1 < a1 for every i < 1.)

On-Line Analytical Processing

There are two broad classes of database activity. Online Transaction Processing (OLTP) consists of short transactions and simple queries that touch small portions of data. They imply frequent updates. On the other hand, Online Analytical Processing (OLAP) consists of long transactions and complex queries that touch large portions of the data. They imply infrequent updates.

OLTP and OLAP are not completely discrete. In fact, we can think of a spectrum of transactions between short and simple (OLTP) and long and complex (OLAP).

Data warehousing is the concept of bringing data from OLTP sources into a single “warehouse” for OLAP analysis.

Decision support systems (DSS) …

A star schema consists of one large fact table with many smaller dimension tables. The fact table is updated infrequently and is only append-only. The dimension tables are updated frequently.

Two types of queries were added to SQL to handle OLAP.
– with cube is not yet implemented in any DBMS.
– with rollup is implemented only in MySQL. It is similar to summaries in Excel pivot tables. With rollup is useful for showing the hierarchical structure of attributes. (For example, attributes state, county, city can be “rolled up” into state.)

For large-scale datasets, using summary tables from the cube can be orders of magnitude faster than using data tables themselves.

December 12, 2011 / ecschroder

Querying XML, UML, Indexes, Constraints and Triggers

These are some notes from the database class that Stanford offered online & free. Good times…

Querying XML

XSL = Extensible Stylesheet Language
XSLT = XSL (with) Transformation

XSLT performs rule-based transformations. It can be used to:
– match templates and replace
– recursively match templates
– extract values (using XPath)
– iteration (for each)
– conditions (if)

Watch out for some strange whitespace behavior.
XPath has an implicit template priority schema.

Here’s a snippet of XSL that finds all Title attributes of Book elements in an XML document and returns them within a set of <BookTitle> tags:

<xsl:stylesheet ….. >
<xsl:template match=”Book”>
<BookTitle><xsl:value-of select=”Title”></xsl:value-of>
</xsl:template>
</xsl:stylesheet>

Note: If data doesn’t match any of the templates, it will return a concatenated string of all leaf elements. To fix that (i.e. to remove the extraneous string), do this:

<xsl:template match=”text()”>
</xsl:template>

That way, the extra leaf elements will match this template element. Since we left the template blank, the extra string won’t appear in the output.

If part of an XML document matches two different templates, the processor will use the second template. Except, more-specific rules are considered high-priority. (That’s sorta similar to CSS.)

Here’s an example of a rule with a condition:
<match=”node[@Price &lt; 90]”>
Note that we used &lt; to represent <.

match=”/” returns all nodes from the root element

match=”*|@*|text()” returns everything

Recursively apply templates

If you want to restructure the entire XML document, this generic template will help. It returns a generic copy of everything.

<xsl:template match=”*|@*|text()”
<xsl:copy>
<xsl:apply-templates select=”*|@*|text()” />
</xsl:copy>
</xsl:template>

You can also use conditionals (if) and iterations (for each) in templates. That’s useful for turning XML into HTML.

Unified Modeling Language

Higher-level database design models include Entity Relationship Model (E/R) and Unified Modeling Language (UML). Both of these design models are graphical. Both can be translated to relational models automatically.There are five key concepts in UML:
– classes
– associations
– association classes
– subclasses
– composition and aggregationClasses consist of a name, attributes, and (optionally) methods.

Associations are the relationships between objects of two different classes. Association multiplicity can be one-to-one, many-to-one, many-to-many, or complete.

Example:
Each object of class C1 is related to at least m and at most n objects of class C2.

Association classes are unnecessary if the multiplicity is 0..1 or 1..1. Otherwise, we can represent association classes like this:

Subclasses are a specialization, whereas superclasses are a generalization. Subclasses can be incomplete (or partial) or complete. They can also be disjoint (or exclusive) or overlapping.

Composition and aggregation are used to describe how objects of one class belong to objects of another class. In this example, every department belongs to a college. Some, but not all, apartments belong to a college.

UML to Relations

UML can be translated directly to relations, provided that every “regular class” has a key. (Subclasses are not regular classes.)

Classes to relations

Straightforward, just like you’d think.

Associations to relations

Take the primary key from both sides.

Association classes to relations

Depends on multiplicity. There’s a rule:
If 0..1 is on the left side, then primary key from the right side is the key of the association relation.

Subclasses to relations

Can be translated many different ways. The best translation depends on the properties. A few schemas include:
1. Subclass relations contain superclass key and specialized attributes.
2. Subclass relations contain all attributes.
3. One relation containing all superclass and subclass attributes.

The cardinality |C| of a class C is equal to the number of objects in the class.

Indexes

When querying a database, indexes make the difference between scanning the entire table and the immediate location of tuples.

Underlying data structures for indexes are:
– Balanced trees (a.k.a. B trees, B+ trees)
– Hash tables

Balanced trees can be used for locating attributes with values in a certain range, like att = value, att < value, or value < att < value.

Hash tables, on the other hand, run in real time. They can only be used for att = value.

Many database management systems build indexes automatically on primary keys.

There are some downsides to indexing:
– extra space (not really a big deal, just a marginal concern)
– overhead in index creation (a medium-sized concern)
– index maintenance (a large concern, since index maintenance can offset the benefits of indexing altogether)

A physical design advisor can be helpful for choosing which attributes to index. With an input of database stats and planned workload, it gives an output of recommended indexes.

SQL syntax

CREATE INDEX index_name
ON table_name (column_name)

Constraints and Triggers

Constraints and triggers for relational databases are part of the SQL standard. Constraints, as the name suggests, constrain the allowable states of the database. Triggers monitor changes to the database. When changes occur, triggers check conditions and initiate actions.

Integrity constraints impose restrictions on allowable data, beyond the restrictions imposed by structure or type. There are several different types of integrity constraints, including non-null contraints, key constraints, attribution-based and tuple-based constraints, general assertions, and referential integrity constraints (also known as foreign key constraints).

Triggers are also called event-condition-action (ECA) items. When an event occurs, check a condition. If it’s true, then perform this action. Triggers can be used to move logic from the application to the database itself. You might think of them as a special type of stored procedure.

Triggers vary by implementation. (In other words, MySQL, SQLite and PostgreSQL handle triggers differently.)

CREATE TRIGGER name
Before | After | Instead of events
[referencing-variables]
[FOR EACH ROW]
WHEN (condition)
action

Triggers can reference variables from the old and new tuples. For example, you might include the line the following line if you need to make a reference to the old tuple within the trigger:
OLD as old, NEW as new

Triggers can be row-level (using the “FOR EACH ROW” keyword) or statement level (without the keyword).

Some tricky issues with triggers include:
– Knowing whether to use a row-level or statement-level trigger.
– When multiple triggers are activated at the same time, which one should go first?
– Trigger actions that activate other triggers (chaining). Also: self-triggering, cycles, nested invocations.
– Conditions in the WHEN clause vs. as part of the action clause.

December 12, 2011 / ecschroder

Relational Algebra, SQL, Relational Design Theory

These are some notes from the database class that Stanford offered online & free. Good times…

Relational Algebra

Relational algebra forms the basis of database query languages. Operations on a relation produce a relation. The concept was known, but ignored, until E.F. Codd introduced the relational model in the 1970s.

Relational algebra can be used on finite relations only (unlike relational calculus, which allows operations on an infinite formula).

Certain operations are primitives of relational algebra. These core operations were defined by Codd. Primitive operations are: selection, projection, Cartesian product, set union, set difference, and rename. Other operations are derived from these six operations. The abbreviated operations are: intersection, division, natural join, and many more.

SQL

Here’s a basic select statement in SQL. This example use relations Student(sID, sName, GPA, sizeHS) and Apply(sID, cName, major, decision).

SELECT sName, major
FROM Student, Apply
WHERE Student.sID = Apply.sID

Relational algebra constitutes the formal underpinnings, while SQL is a practical implementation. There are differences. For example:
– Relational algebra, which is based on the set model, allows duplicate values.
– SQL, which is based on the multiset model, does not allow duplicates.

Datalog

Database logic is called datalog and consists of if/then rules. The rules return TRUE or FALSE. For example, R(a1, a2, … , an) is TRUE if R(a1, a2, … , an) is a tuple in R. (See also: predicates, atoms, head, body, subqueries.)

Relations are represented by predicates. A predicate the name of a function that returns some (Boolean) values. An atom is a predicate followed by its arguments.

Given a relation R(a,b) with tuples (1,2) and (3,4), we can say

R(1,2) = TRUE
R(3,4) = TRUE
R(5,7) = FALSE

Predicates can take variables or constants. Using the example above, R(1,z) = TRUE if z=2.

The basic syntax of datalog is:

head ← body

The body consists of several subqueries, linked together with AND or NOT operators. Using datalog rules, we can construct new tuples with the tuples that satisfy subqueries of the body. Imagine iterating through all of the values in the body. If some combination of values makes the body = TRUE, then add that tuple to the relation whose predicate is the head.

SQL in a Server Environment

You can embed SQL in a program written in ordinary programming languages (e.g. C). One critical issue: how to move data between SQL relations and variables of the “host” language.

Persistent stored modules are pieces of code that are stored in the database and executed by the user.

A call-level interface allows you to program in conventional languages, and use a library of functions to access the database.

Three-tier architecture

A three-tier architecture looks something like this:


Database servers run database management systems (DBMS) and perform queries and modifications.
Application servers run the business logic of the system.
Web servers connect clients to the database system.

Relational Design Theory

Properties and Normal Forms

Normalization is the process of reducing redundancies in relational database design. Some normal forms are 3NF, Boyce-Codd Normal Form, and 4NF. Of those three, 3NF is the least restrictive and 4NF is the most restrictive.

Evaluate functional dependencies to test for Boyce Codd Normal Form.
Evaluate multivalued dependencies to test for Fourth Normal Form.

Functional Dependencies and Boyce Codd Normal Form

Relations are said to be in Boyce Codd Normal Form if, for every functional dependency A → B, then A is a key.

Functional dependencies are a generally useful concept.
– data storage, compression
– reasoning about queries, optimization
– generalization of the concept of keys

Definition of Functional Dependency
For every t,u in R
If t.A = u.A then t.B = u.B
Then A → B

Functional dependencies are based on knowledge of the real world. All instances of a relation must adhere to them.

Multivalued Dependencies and 4NF

Relations are said to be in Fourth Normal Form if, for every multivalued dependency A →→ B, then A is a key.

Rules to know:
– Splitting rule
– Combining rule
– Trivial dependency rules
– Transitive rule
– Closure of attributes

Closure and Keys

Question:  Is  a set of attributes A a key for R?
Answer: Compute the closure of A, called A+ . If it’s equal to the set of all attributes, then A is a key.

Specifying Functional Dependencies (FDs) of a relation

S1 and S2 are sets of functional dependencies.
S2 “follows from” S2 if every relation instance satisfying S1 also satisfies S2.

S2 : { SSN → priority }
S1 : { SSN → GPA, GPA → priority }

(Hey linguists, computing closure is sorta like making sentences from phrase structure trees in syntax. Start with the head, then use sets of rules–ehem, functional dependencies–to add branches.)

How to test? Does A → B follow from S?
1. Compute the closure A+ based on functional dependencies S. Check if B is in the set.
2. Armstrong’s Axioms

Want: the minimal set of completely non-trivial FDs such that all FDs that hold on the relation follow from the dependencies in this set.

Boyce-Codd Normal Form

a.k.a. 3 ½ Normal Form

Decomposition of a relational schema…

Relational design by decomposition
– “mega” relations and properties of the data

BCNF decomposition algorithm

Input: Relation R and Functional Dependencies for R
Output: Decomposition of R into BCNF relations with “lossless join”

1. Compute keys for R (using FDs)
2. Repeat until all relations are in BCNF:
Pick any R’ with A → B that violates BCNF.
Decompose R’ into R1(A,B) and R2(A,rest)
Compute functional dependencies for R1 and R2
Compute keys for R1 and R2

Note: Sometimes you don’t want to decompose relations into BCNF. Larger relations can be useful, because you won’t have to join then back together for complicated queries. But, you have to take the query load into consideration.

Multivalued Dependency

Multivalued dependencies are based on knowledge of the real world data that is being captured by the database. All instances of the relation must adhere.

[ see notes for diagram ]

Multivalued dependencies are sometimes called tuple-generating dependencies.

Know these rules:
– FD-is-an-MVD rule
– Intersection rule
– Transitive rule

Fouth Normal Form

Relation R with MVDs is in 4NF if, for each nontrivial A →→ B, A is a key.

Summary

Question: Why is a functional dependency called “functional”?
Answer: Because we can imagine some function f such that f(movie,year) = length. In other FD terms, movie, year → length.

Here’s a tip for decomposition: start by decomposing the largest functional dependency first. You’ll get a better design in the end.

Remember, decomposed != better design. You may need to access certain attributes together when querying the data, and the decomposed form will necessitate many joins. It may be better to leave those attributes together in the same relation.

Multivalued dependencies are a subset of functional dependencies.

Functional dependencies and BCNF : When we have relation R(A,B,C) with functional dependency A → B,  we can factor out dependencies so we don’t repeat them over and over.

Multivalued dependencies and 4NF : relation R(A,B,C,D) with multivalued dependencies A →→ B. Consider the multiplicative effect of combinations. Put those in a separate relation.

4NF is a stronger design that BCNF.

When designing a database schema, remember that many designs are possible. Some are much better than others. Relational design theory helps us plan our schema. Normal form can lead to “good” relations. We learned how to design by decomposition. Usually, this process is intuitive and works well. But, it does have some shortcomings including dependency enforcement, query workload and over-decomposition.

December 12, 2011 / ecschroder

Notes: Relational Databases, XML Data

Phew, just finished the database class that Stanford offered free! online! this semester! Good times. A few notes…

Introduction

Data manipulation languages (DML) are used to query and modify databases. They let you do ad-hoc queries in high-level languages.

Queries are performed on relations, and they return relations. (In other words, they are compositional and closed.)

Query languages include SQL and XPath. SQL is based on the principles of relational algebra. We could say that relational algebra is a formal query language, while SQL is an actual/implemented query language.

Relational Databases

A database consists of a set of relations (or tables). Each relation has a set of named attributes (or columns). Each tuple (or row) has a value for each attribute. (The value could be NULL, or empty.)

A data schema is the structure of the database; an instance is the data itself.

NULL is a special type of value. It means “unknown” or “undefined.” Be careful when querying data sets with null values; null values don’t satisfy the normal query rules. In other words, neither of these statements is true:

* NULL > 3.5

* NULL ≤ 3.5

If you were looking for records of a database which satisfy the condition GPA > 3.5, you won’t find Sally Sue, who has an unknown GPA (NULL). But she won’t turn up when you query for GPA ≤ 3.5, either.

A key is an attribute for which every value is unique. Or, to be more precise, a key is a set of values, for which every combination is unique. Lots of database systems will automatically add an ID attribute to serve as the key for a relation.

XML Data

XML: Introduction, Well-Formed XML

XML is a standard format for data representation and exchange. XML also has a streaming format. The formal spec for XML is enormous, so this course will only cover the most important concepts.

The basic constructs of XML are: tagged elements, attributes, text (a.k.a. “leaf elements of the tree”).

Here’s how the Relational Model and XML compare:

Relational Model XML
Timeline Has been around for a while. A lot newer than the relational model.
Structure Tables Hierarchy
Queries Simple, nice languages (well-developed over time) Less so…
Ordering None Implied
Implementation Native (mature systems handle it natively) Add-on (hasn’t been around very long)

Well-formed XML adheres to basic structural requirements:
– single root element
– matched tags, proper nesting
– unique attributes within elements

Displaying XML is a snap. Use rule-based language to translate it into HTML. Then apply styles via:
– cascading style sheets (CSS)
– extensible stylesheet language (XSL)

DTDs, IDs and IDREFs

DTDs can be written in the same document as the XML, or it can be kept in a separate file.

XML documents must meet some criteria to be Well-formed or Valid:

Well-formed XML Valid XML
Single root element Meets criteria of well-formed XML, plus…
Matched tags, proper nesting
Unique attributes within elements
Adheres to a content-specific spec (such as DTD or XSD)

Document Type Descriptors (DTD) are a grammar-like language used for specifying elements, attributes, nesting, ordering and the allowed number of occurrences of certain elements.

Reasons to use DTD/XSD (= benefits of typing):
– programs can assume that there’s a structure to the data
– CSS/XSL can assume that there’s a structure to the data
– there will be a specification for data exchange
– it makes documentation easier

Reasons not to use DTD/XSD (= benefits of no typing):
– flexibility, ease of change
– DTDs can be messy, irregular
– XDS can be very messy

Here’s an example of a DTD:
<!DOCTYPE Bookstore [
<!ELEMENT Bookstore (Book*, Author*)>
<!ELEMENT Book(Title, Remark?)>
<!ATTLIST Book ISBN ID #REQUIRED
Price CDATA #REQUIRED
Authors IDREFS #REQUIRED>
<!ELEMENT Remark (#PCDATA | BookRef)*>

There are some differences between XML schemas and DTD. For example, in XML schema, you can have typed pointers. Not so in DTD.

XML Schema

XSD is written in XML, with special tags. It’s an extensive language. Like DTDs, XSD can specify elements, attributes, nesting, ordering and the allowed number of occurrences of a certain element.

XSD is always kept in a separate file. When using an XSD, you’ll need two files: a schema file and a data file. On the other hand, DTD can be written in the same file (at the top) or in a separate file.

There are 4 features of XSD that are not found in DTD:
– typed values
– key declarations
– references
– occurrence constraints

November 25, 2011 / ecschroder

Got a mac =)

Got a mac! Life just got better. Thank you Jacob, for the deal & for the protips. Going through some set up stuff now. Things on my mind:

Set up virtual environments

Say you need to use Python 2.6 for one project; Python 3.1 for another. Each time you switched between projects, you would need to switch back and forth between Python installations. Instead of changing your default PYTHONPATH settings back and forth, you can work within a virtual environment.

First, install the virtualenv package. (Great instructions here.) Then use these commands to create and workon virtual environments.

mkvirtualenv --no-site-packages [name of virtualenv]

…creates a new virtual environment, but doesn’t copy everything else that’s in the root.

workon [name of virtualenv]

…lets you work within a certain virtual environment.

(test_environ)Emmas-iMac%

…reminds you that you’re in a virtual environment.

Connect to GitHub

Create a new SSH key and connect it your GitHub account. Instructions here.

Get homebrew

Get homebrew and use it to install packages. Stuff that’s not Python, I mean. Don’t use homebrew to install Python. It will screw with your life. (Homebrew will always grab the latest version of everything, and you don’t want that for Python.)

Install PIL

In my Trash or Treasure project, each item in the database has an image. When I try to add a new item through the admin interface, I get this message:

Upload a valid image. The file you uploaded was either not an image or a corrupted image.

It seems that Django can automatically handle PNG images, but can’t do anything with JPG or GIF unless it has its sidekick PIL (Python Imaging Library). This is where Mac ownership is going to pay off. I seem to remember an afternoon wasted installing PIL on Windows. On Mac, I’m guessing it’s just:

brew install pil

It works. OMG that is so easy. [Note to self: need to install packages in each virtual environment, or once for good?]

Managing shared files

Most of my project files are on GitHub, so sharing is all set. But assets and images won’t live in the code repo, so I need another way to share them between computers. Dropbox is super easy…winning. Just need to copy over a few static files and…that’s it.

November 8, 2011 / ecschroder

First look at user auth

Now that the remake is all finished*, I can venture out into new Django territory. Users! From the documentation, it sounds like Django user auth is quite simple (viz. “included” “free” “out-of-the-box”). Still, here’s a list of vocab that I don’t know: middleware, sessions, salt and hash. Spoon, please.

This post was a great read: http://www.b-list.org/weblog/2007/nov/21/auth/ **

* In fact, voting doesn’t work in the remake. I figure that functionality will change a lot after I add users, so I didn’t bother with it now.

** I do not forgive you for that terrible card trick, James.

{% csrf_token %} ftw!

When I first heard about Cross Site Request Forgery protection, and its handy template token {% csrf_token %}, I thought: ruh-roh. What if I forget that?

I was thus happy to see this error:

screenshot of error message

You mean, Django is going to tell me if I #*@( up? Sweet!

Hmm. Lots of problems with the first tutorial. Let’s try the Django docs again: https://docs.djangoproject.com/en/dev/topics/auth/#authentication-in-web-requests

Better: http://rdegges.com/user-authentication-with-django

Rdegges is my hero.

Thank you for the well-written and informative tutorial on User Authentication in Django. It works!

How exciting is that?

You can see the deployed version here: http://p00000578.djangozoom.net

Up next: Part II ..:or:.. Welcome to the party, users.

Coming next…integrate user auth + new login page template on my Trash or Treasure site. (Sorry PyLadies, I haven’t forgotten you & your awesome support. Temporarily deferred while db-class heats up. =)

October 24, 2011 / ecschroder

Remake Part 2

In this post, I’ll finish up the remake of my first Django project. I made a lot of mistakes the first time around, so I’m starting over and trying to do it better.

Remake Part 1 went like this:

  • Started the project, created a repo
  • Set up the database
  • Created models.py

Next, I gotta:

  • Write URLs
  • Write views
  • Create templates

4. Write URLs.

First, I’ll enable the admin view by uncommenting a few lines in urls.py. Also need to add admin to the list of INSTALLED_APPS in setttings.py. Running the server, checking the admin view at 127.0.0.1:8000/admin, et voilà. It works.
Oops, I don’t see anything about KnickKnacks in the admin view. Need to follow these instructions from the tutorial to make the KnickKnack app available in the admin interface. Within the trashure app, I created an admin.py file. It reads like this:
from trashure.models import KnickKnack
from django.contrib import admin

admin.site.register(KnickKnack)
Restarting the dev server and checking again…eccolo, Knickknacks appear in the admin interface.
Second, I’ll write URLs for two different views: index and detail. I want this app to be stupid-easy to read, so I’ll write the tuples like this:
	url(
		regex = '^trashure/$',
		view = 'trashure.views.index',
		name = 'index',
	),

	url (
		regex = '^trashure/(?P<knickknack_id>\d+)/$',
		view = 'trashure.views.detail',
		name = 'detail',
	),
Of course, those URLs won’t resolve yet, because the views that they will call don’t actually exist. But I suppose I could test them anyway. I’m expecting to see a “View does not exist” error, or something like that.
Testing 127.0.0.1:8000/trashure, got a NameError: “name ‘trashure’ does not exist.” Is that what I was expecting? Yes, close enough.

5. Write the views.

Right now, views.py exists but is a blank slate. As far as I understand, views.py is where you’ll do all of your calculations and queries. It’s going to grab stuff from the database, and spit that into the templates.
As a reminder, the KnickKnack relation (which we’ve defined in models.py during Remake Part 1) goes like this:
KnickKnack (name, photo, description, memory, year, era, pub_date, votes_treasure, votes_trash, score)
Hmm, no primary key? (Oh wait…Django adds an ID (primary key) automatically.)
For the record, all of this makes a hell of a lot more sense after a few weeks of Stanford’s online database class.

Detail view

The detail view is the simpler of the two views, so I’ll start there. There are basically two sections of the detail view that are coming from the database:
  • 1 full KnickKnack, with its name, photo, description, etc.
  • 4 random KnickKnacks, but just the photo. (“Related Trash,” that is.)
In other words, we’ll need to return those things into a context, like this:
#...
return render_to_response(
		'trashure/detail.html',
		dictionary={
			'knickknack': p,
			'relatedtrash': relatedtrash,
		},
		context_instance=RequestContext(request)
#...
Of course, we’re working backwards here, from the bottom up. Now that we know what we’re returning from views.py, we can construct those things. To get a knickknack, we’ll need two attributes: (1) a request (HTTP request, I guess?) and (2) a knickknack_id, which we parsed out of the URL in urls.py. It will go like this:
def detail(request, knickknack_id):
p = get_object_or_404(KnickKnack, pk=knickknack_id)
#… 
To get some “Related Trash,” we will
  • grab all of the KnickKnacks
  • order them randomly
  • take the first four of those random objects
It goes like this:
randomtrash = KnickKnack.objects.order_by('?')[:4]
Now I’ll add the required import statements to the top of views.py. We’ve used these packages, so need to import ’em:
from django.template import Context
from django.template import loader
from django.template import RequestContext
from django.shortcuts import get_object_or_404
from django.shortcuts import render_to_response

from trashure.models import KnickKnack

Index view

The index view will show a few different pieces of the pie:
  • three random objects
  • two “Featured Treasure” objects
  • four “Featured Trash” objects
First, we’ll define our index view. This time, we’re only passing one attribute, request. We’ll construct the aforementioned sets like this:
def index(request):
    randomtrash = KnickKnack.objects.order_by('?')[:3]
    toptreasure = KnickKnack.objects.order_by('-score')[:2]
    toptrash = KnickKnack.objects.order_by('score')[:4]
    # ... 

If we want to order_by a negative score, we'll have to import the reverse function, like this:

from django.core.urlresolvers import reverse
To finish up, we’ll pass those sets into the context dictionary, and spit them into a template:
return render_to_response(
    'trashure/index.html',
    dictionary={
        'randomtrash': randomtrash,
        'toptreasure': toptreasure,
        toptrash': toptrash,
        },
        context_instance=RequestContext(request)
    )

You can see the code at the end of Step 5 on GitHub.

6. Incorporate the templates.

I’ve already prepared two static HTML pages and the all-important CSS stylesheet. First, I’ll set up the directory structure so that I can put everything in the right places in my project. Then I’ll swap out some static parts of the HTML templates and replace them with template tags. (A little bit of background reading before this step; definitely didn’t catch everything the first time.)

That was some very nice reading. Got some notes.

Variables
dot tries, in this order:
1. dictionary lookup
2. attribute lookup
3. method call
4. list-index lookup
{{variable}}
{{variable.something}}

Filters
30+ built in filters
{{ name | lower }}

Tags
{% for %}{% endfor %}
{% if %}{% else %} {%endif%}
{%block%}{%extends%}

Comments
{# single line comment #}
{%comment%} multiline comment{%endcomment%}

Set up the directories to images, CSS and JS

Let’s see if I can remember how this goes…. Nope. Where do I put static files? I’m uploading a few “knickknacks” through the admin interface so I’ll be able to see where the new media/ folder gets created. I’ll put static/ at the same level. It will go like this:

remake/
    trashure/
    media/
    static/
        img/
        css/
        js/

I’m pretty lazy and it’s 1 am, so I’m just copying everything in static/ over from the first project. Making a few tweaks (new variable names in the remake, etc.) Testing it out.

Oops, no stylesheet is showing. I must have forgotten something when setting up the directories. Indeed…forgot to change this part of settings.py:

STATICFILES_DIRS = (
	'C:/remake/static',
)

I don’t understand why I need to set an absolute path here. Why does that still work when it’s deployed online? Does Django-Zoom (or similar) automagically change the path to something relative during the deployment process?

New problem, new solution

In the first project, I choose to represent the “year” attribute as a string, or CharField. That is, models.py had the line:

year = models.CharField(max_length=4)

This (slightly-less-totally-ghetto) time around, I used a DateField for the “year” attribute.

year = models.DateField('year acquired')

Before, Django would display the four digits of the year, but only because they were a string. Now, the attribute “year” is the datetime data type, and Django is displaying it in the date format: Mon DD, YYYY. Since I only want to show the year, I can add a template filter. Within the detail.html, changing {{ knickknack.year}} to {{ knickknack.year|date:”Y”}} , and the output is tout simplement YYYY.

Where are the * pictures?

Gah, where are the #$%#images? Getting the path to the images was super confusing last time.

Adding this chunk to the bottom of urls.py does the trick (thanks Jacob!):

if settings.DEBUG:
	urlpatterns += patterns('', url(
		r'^media/(?P<path>.*)$',
		'django.views.static.serve',
		{'document_root': settings.MEDIA_ROOT, }
	))

The remake is finished! You can see the code at the End of Step 6 on GitHub. Or the code at the End of Step 6 for real. Oops.

On to users!

October 22, 2011 / ecschroder

Lightning Talk: Hello World

Site:

http://bit.ly/oYUjnI

Code:

https://github.com/ecschroder

Blog:

https://djangolookslikefun.wordpress.com/

Starting point:

  • some HTML, CSS
  • a tiny bit of Python
  • nothing about databases

Lots of problems:

  • Setting up dev environment on Windows
  • Getting used to Git
  • General n00biness

Lots of help:

  • PyLadies
  • DjangoCon
  • Documentation!

Notable moments in history:

  • Deploying on DjangoZoom
  • Code review from PyLadies
  • Starting over

Next steps:

  • Finish Remake Part 2
  • Set up OAuth :)
October 5, 2011 / ecschroder

Database scavenger hunt

The first time you run the command python manage.py syncdb, Django creates a database based on your models.

Um, okay. Cool.

I’m completely new to databases, so that doesn’t really mean much to me. Today I’ve only got a few minutes to play, so I’m going to try to see what really happens when you sync the database.

Reading: Introduction to Relational Database Design

Looking forward to: Stanford’s Introduction to Database course, online & free (begins October 10, 2011)

Let’s syncdb.

I’ve already created my database, but haven’t added any records yet. Seems like a good time to delete it & take it from the top.

C:\remake>python manage.py syncdb
Creating tables ...
Creating table auth_permission
Creating table auth_group_permissions
Creating table auth_group
Creating table auth_user_user_permissions
Creating table auth_user_groups
Creating table auth_user
Creating table auth_message
Creating table django_content_type
Creating table django_session
Creating table django_site Creating table trashure_knickknack

You just installed Django's auth system, which means you don't have any superusers defined.
Would you like to create one now? (yes/no): yes
Username (Leave blank to use 'emma'):
E-mail address: fake_address@fake.com
Password:
Password (again):
Superuser created successfully.
Installing custom SQL ...
Installing indexes ...
No fixtures found.

So…what just happened?

First of all, we’ve got a new file in our project folder: objects.db. What’s going on in there? If only we could see inside….

Joe told me about SQLite Database Browser, which is open-source tool for creating & editing databases. We can use the tool to open the database and take a look. Here’s what I see:

Look kids, it's a database!

I see lots of tables: auth_permission, auth_group_permission, trashure_knickknack, etc. These tables bears a striking resemblance to the tables that Django told us it was creating when we ran python manage.py syncdb. Hmm….

One of those tables–trashure_knickknack–is my own creation. I wrote that model in my last episode. All the other ones? I have no idea where they came from or what they are.

Scavenger hunt for auth_permission.

Let’s look at one of the tables: auth_permission. As we can see in the screenshot, that table has four fields: id, name, content_type_id, and codename. Where did it come from? Let’s go on a scavenger hunt in the Django source code. Maybe we’ll find some clues.

Where did it come from?

In our last episode, we created the model for the trashure_knickknack table. To finish that process, we had to add trashure to a list of INSTALLED_APPS, which was found inside settings.py. Maybe we can start there?

Looking in settings.py, we see the entire list of INSTALLED_APPS goes like this:

INSTALLED_APPS = (
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.sites',
    'django.contrib.messages',
    'django.contrib.staticfiles',
	'trashure',
    # Uncomment the next line to enable the admin:
    # 'django.contrib.admin',
    # Uncomment the next line to enable admin documentation:
    # 'django.contrib.admindocs',
)

What is it?

Peeking inside C:\django\django\contrib\auth, we find another file called models.py. That sounds like a good place to look for clues.

The models.py has a class called “Permissions.” It goes like this:

class Permission(models.Model):
    """The permissions system provides a way to assign permissions to specific users and groups of users.

    The permission system is used by the Django admin site, but may also be useful in your own code. The Django admin site uses permissions as follows:

        - The "add" permission limits the user's ability to view the "add" form and add an object.
        - The "change" permission limits a user's ability to view the change list, view the "change" form and change an object.
        - The "delete" permission limits the ability to delete an object.

    Permissions are set globally per type of object, not per specific object instance. It is possible to say "Mary may change news stories," but it's not currently possible to say "Mary may change news stories, but only the ones she created herself" or "Mary may only change news stories that have a certain status or publication date."

    Three basic permissions -- add, change and delete -- are automatically created for each Django model.
    """
 name = models.CharField(_('name'), max_length=50) content_type = models.ForeignKey(ContentType) codename = models.CharField(_('codename'), max_length=100) objects = PermissionManager() 

    class Meta:
        verbose_name = _('permission')
        verbose_name_plural = _('permissions')
        unique_together = (('content_type', 'codename'),)
        ordering = ('content_type__app_label', 'content_type__model', 'codename')

    def __unicode__(self):
        return u"%s | %s | %s" % (
            unicode(self.content_type.app_label),
            unicode(self.content_type),
            unicode(self.name))
    def natural_key(self):
        return (self.codename,) + self.content_type.natural_key()
    natural_key.dependencies = ['contenttypes.contenttype']

Mystery and intrigue.

I would not expect some random source code file to make any sense at all. But wait…there are a ton of comments and explanations in there. I don’t know what it is or what it all means, but it seems strange and beautiful. Like this thing:

Another strange and beautiful creature.

So…how was school today?

I definitely don’t understand much, but have a sliver of hope that it will make sense eventually. Someday.